Bug 2171801 - virt-operator performs redundant reconciliations to its operands
Summary: virt-operator performs redundant reconciliations to its operands
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Container Native Virtualization (CNV)
Classification: Red Hat
Component: Virtualization
Version: 4.13.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 4.15.0
Assignee: Orel Misan
QA Contact: Kedar Bidarkar
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2023-02-20 13:18 UTC by Igor Bezukh
Modified: 2023-11-10 06:43 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2023-11-10 06:43:40 UTC
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Red Hat Issue Tracker CNV-25915 0 None None None 2023-02-20 13:20:02 UTC

Description Igor Bezukh 2023-02-20 13:18:17 UTC
Description of problem:
There is a subset of configuration fields in the Kubevirt CR that define controller behavior rather than the desired deployment state. For example, the migration configuration doesn't require any reconciliation of virt-controller. The controller reads the updated values during its regular execution cycles.

Nevertheless, when changing such configuration, virt-operator performs reconciliation. The operator patches the virt-api, virt-controller, PDB and webhook resources with a new annotation each time. 

This is redundant, since the generation number should reflect the changes made to the resources, and not the internal runtime configuration.


Version-Release number of selected component (if applicable):


How reproducible:
100%

Steps to Reproduce:
1. Edit the Kubevirt CR. Update the following: 
   - spec.configuration.migrations.parallelMigrationsPerCluster: 100
2. Observe the virt-api deployment resource. The kubevirt.io/generation annotation will increase, and metadata.generation will increase as well.
3. Observe the Kubevirt CR status. The Last generation of the above-mentioned operands will increase.

Actual results:
Generation annotations of operands are being bumped.

Expected results:
Generation annotations should be bumped only when the desired state changes.


Additional info:

Comment 1 Antonio Cardace 2023-09-11 09:48:55 UTC
Deferring to 4.15 due to capacity reasons.

Comment 2 Orel Misan 2023-11-02 15:15:23 UTC
After an offline discussion, it was decided that the expected behavior should be defined.

Comment 3 Igor Bezukh 2023-11-10 06:43:40 UTC
For custom resources, Kubernetes by design increments the ObjectMeta generation number each time the CR spec is changed. Therefore, when we change something in the Kubevirt CR spec, its generation number increases. In addition, each time the Kubevirt CR is modified, a virt-operator sync happens.

As part of the sync, the reconciler flow takes place. To figure out whether an operand needs to be reconciled, the reconciler compares the operand as it appears in the install strategy with the operand as it is seen in the client cache store.

The install strategy is not the single source of truth for how an operand should be deployed. In some cases, on day 2 after Kubevirt was deployed, the administrator needs to adjust the deployment of some operands. For example, in a large-scale cluster it’s not enough to have only 2 replicas of virt-api, so the Kubevirt CR allows the administrator to manually set the number of replicas. It is the reconciler's responsibility to re-deploy virt-api with the custom number of replicas. To do so, the reconciler logic has injection methods, which inject the custom configuration of an operand on top of the one from the install strategy. The injected result is then compared with the object in the client cache, and if the two are not equal, the virt-api deployment needs to be reconciled.
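
To illustrate the flow described above, here is a minimal Python sketch. It is not the actual virt-operator code (which is written in Go); the function names `inject_replicas` and `needs_reconcile` and the dict layout are hypothetical and only model the idea of "inject the CR override on top of the install strategy, then compare with the cache".

```python
import copy
from typing import Optional


def inject_replicas(strategy_obj: dict, custom_replicas: Optional[int]) -> dict:
    """Return a copy of the install-strategy object with the CR override applied."""
    injected = copy.deepcopy(strategy_obj)
    if custom_replicas is not None:
        injected["spec"]["replicas"] = custom_replicas
    return injected


def needs_reconcile(injected_obj: dict, cached_obj: dict) -> bool:
    """The operand is reconciled only when the injected object differs from the cache."""
    return injected_obj["spec"] != cached_obj["spec"]


strategy = {"spec": {"replicas": 2}}  # what the install strategy ships
cached = {"spec": {"replicas": 2}}    # what the client cache currently holds

# No override in the CR: injected object equals the cache, nothing to do.
print(needs_reconcile(inject_replicas(strategy, None), cached))
# Administrator asks for 5 replicas: injected object differs, reconcile.
print(needs_reconcile(inject_replicas(strategy, 5), cached))
```

The same comparison also covers the case where someone edits the operand directly: the cache then differs from the injected object, so the operand is restored.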

Comparing the client cache also helps to determine the need to restore the operand to its original state, when someone modified the operand directly.

Custom injection can apply to the operand spec, but also to the operand metadata. With regard to metadata, the reconciler injects the kubevirt.io/generation annotation into each operand it manages. That annotation reflects the current Kubevirt CR generation.

Now, as we said earlier, the generation of a custom resource changes each time its spec is changed. Therefore, the reconciler will modify the generation annotation of every operand each time the Kubevirt CR spec is modified.
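
A small Python sketch of why every spec change ends up as an operand patch. The helper names `inject_generation` and `annotation_patch_needed` are hypothetical and this is only a model of the behavior, not the operator's real code; only the annotation key kubevirt.io/generation comes from the bug itself.

```python
GENERATION_ANNOTATION = "kubevirt.io/generation"


def inject_generation(operand_metadata: dict, cr_generation: int) -> dict:
    """Return a copy of the operand metadata stamped with the CR generation."""
    metadata = dict(operand_metadata)
    annotations = dict(metadata.get("annotations", {}))
    annotations[GENERATION_ANNOTATION] = str(cr_generation)
    metadata["annotations"] = annotations
    return metadata


def annotation_patch_needed(cached_metadata: dict, cr_generation: int) -> bool:
    """A patch is needed whenever the cached annotation lags behind the CR."""
    current = cached_metadata.get("annotations", {}).get(GENERATION_ANNOTATION)
    return current != str(cr_generation)


# Operand was last stamped when the CR was at generation 3.
cached = inject_generation({"name": "virt-api"}, 3)
print(annotation_patch_needed(cached, 3))  # CR unchanged
# Any spec edit (even a runtime-only field) moves the CR to generation 4.
print(annotation_patch_needed(cached, 4))
```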

Some operands have a generation number of their own. That generation number can be increased when the operand metadata is changed. For example, when you modify the metadata of a deployment, its generation number is increased, because the k8s deployment controller propagates the deployment metadata to its corresponding ReplicaSet. For other operands, the generation number is not used at all, or it may always be set to 1.
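
The following Python sketch models the Deployment side effect described above. It is a simplification under the assumption that a change to either the spec or the annotations bumps the deployment's metadata.generation; the function name `next_generation` is hypothetical and the real logic lives in the Kubernetes API server.

```python
def next_generation(old: dict, new: dict) -> int:
    """Model: bump the generation when spec or annotations differ (assumed rule)."""
    spec_changed = old["spec"] != new["spec"]
    annotations_changed = (
        old["metadata"].get("annotations", {}) != new["metadata"].get("annotations", {})
    )
    bump = 1 if (spec_changed or annotations_changed) else 0
    return old["metadata"]["generation"] + bump


old = {
    "metadata": {"generation": 7, "annotations": {"kubevirt.io/generation": "3"}},
    "spec": {"replicas": 2},
}
# Only the annotation changes; the spec is identical.
new = {
    "metadata": {"generation": 7, "annotations": {"kubevirt.io/generation": "4"}},
    "spec": {"replicas": 2},
}
print(next_generation(old, new))
```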

So each time the Kubevirt CR is changed and the reconciler modifies the annotation, some operands have the side effect of their generation number being also increased.

To summarize, the reconciliation is not redundant. The usage of the kubevirt.io/generation annotation seems redundant at first sight, but it can also help with reconciler debugging, where we can compare 3 numbers: the Kubevirt CR generation number in the metadata, the Kubevirt CR observed generation number, and the number that appears in the operand metadata.
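
As a rough illustration of that debugging comparison, here is a Python sketch. The function name `compare_generations` and the returned messages are hypothetical; the three inputs are the numbers mentioned above, read by whatever means is convenient.

```python
from typing import Optional


def compare_generations(
    cr_generation: int,
    cr_observed_generation: int,
    operand_annotation: Optional[str],
) -> str:
    """Classify reconciler progress from the three generation numbers."""
    if cr_observed_generation != cr_generation:
        return "operator has not yet observed the latest CR spec"
    if operand_annotation != str(cr_generation):
        return "operand not yet updated for the observed generation"
    return "in sync"


print(compare_generations(4, 3, "3"))
print(compare_generations(4, 4, "3"))
print(compare_generations(4, 4, "4"))
```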

