Bug 2095707 - drain annotations are updated when no drain is executed
Summary: drain annotations are updated when no drain is executed
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Machine Config Operator
Version: 4.11
Hardware: Unspecified
OS: Unspecified
Priority: low
Severity: low
Target Milestone: ---
Target Release: ---
Assignee: Yu Qi Zhang
QA Contact: Sergio
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2022-06-10 10:12 UTC by Sergio
Modified: 2022-10-24 16:28 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-10-24 16:28:45 UTC
Target Upstream Version:
Embargoed:



Description Sergio 2022-06-10 10:12:33 UTC
Description of problem:
When we apply an MC (MachineConfig) that does not trigger a drain execution, the nodes are annotated as drained anyway.


Version-Release number of MCO (Machine Config Operator) (if applicable):
$ oc get clusterversion
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.11.0-0.nightly-2022-06-06-201913   True        False         62m     Cluster version is 4.11.0-0.nightly-2022-06-06-201913

Platform (AWS, VSphere, Metal, etc.):

Are you certain that the root cause of the issue being reported is the MCO (Machine Config Operator)?
(Y/N/Not sure): Yes

How reproducible:

Did you catch this issue by running a Jenkins job? If yes, please list:
1. Jenkins job:

2. Profile:

Steps to Reproduce:
1. Get current annotated drains

$ oc get node -l node-role.kubernetes.io/worker  -o jsonpath='{.items[0].metadata.annotations.machineconfiguration\.openshift\.io/desiredDrain}'
uncordon-rendered-worker-cd52afd4bd39d302834b215fcc978be8

This should be the latest rendered machine config that triggered a drain:
$ oc get mc| grep worker | grep render
rendered-worker-cd52afd4bd39d302834b215fcc978be8   f5950ed0b5e5468fd172b37cef4a8f34995a3b3f   3.2.0             84m
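
The annotation value above appears to consist of an action prefix followed by the rendered config name. A minimal sketch for splitting it, assuming that `<action>-<rendered config>` layout holds (the helper names `drain_action` and `drain_config` are hypothetical and not part of `oc` or the MCO):

```shell
#!/bin/sh
# Hypothetical helpers; they assume the annotation value looks like
# "<action>-<rendered config name>", as in the output above.

# Print the action part (the text before the first hyphen).
drain_action() {
    printf '%s\n' "${1%%-*}"
}

# Print the rendered config part (the text after the first hyphen).
drain_config() {
    printf '%s\n' "${1#*-}"
}
```

For example, `drain_action uncordon-rendered-worker-cd52afd4bd39d302834b215fcc978be8` prints `uncordon`, and `drain_config` with the same argument prints `rendered-worker-cd52afd4bd39d302834b215fcc978be8`, which can then be compared against the `oc get mc` output.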


2. Create a resource that produces a new rendered MachineConfig without triggering a drain execution. We can, for example, create an ImageContentSourcePolicy (ICSP):

apiVersion: operator.openshift.io/v1alpha1
kind: ImageContentSourcePolicy
metadata:
  name: ubi8repo
spec:
  repositoryDigestMirrors:
  - mirrors:
    - example.io/example/ubi-minimal
    source: registry.access.redhat.com/ubi8/ubi-minimal
  - mirrors:
    - example.com/example/ubi-minimal
    source: registry.access.redhat.com/ubi8/ubi-minimal



3. A new rendered machine config is generated
$ oc get mc| grep worker | grep render
rendered-worker-cd52afd4bd39d302834b215fcc978be8   f5950ed0b5e5468fd172b37cef4a8f34995a3b3f   3.2.0             86m
rendered-worker-dc596e0454284758c260b1de37796675   f5950ed0b5e5468fd172b37cef4a8f34995a3b3f   3.2.0             1s

This new rendered config (rendered-worker-dc596e0454284758c260b1de37796675) does NOT trigger a drain execution.

We can see these log messages in the daemon pod logs:
I0610 09:53:29.397019    2072 drain.go:237] /etc/containers/registries.conf: changes made are safe to skip drain
I0610 09:53:29.397030    2072 update.go:544] Changes do not require drain, skipping.
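
To confirm from captured daemon log text that the drain was skipped, one can search for the message shown above. A minimal sketch, assuming the log is fed on standard input (the function name `drain_skipped` is hypothetical):

```shell
#!/bin/sh
# Hypothetical helper: reads daemon log text on stdin and succeeds (exit 0)
# only if it contains the "skipping drain" message quoted in this report.
drain_skipped() {
    grep -qF 'Changes do not require drain, skipping.'
}
```

It could be used as `oc logs -n openshift-machine-config-operator <daemon-pod> -c machine-config-daemon | drain_skipped && echo "no drain executed"`, where `<daemon-pod>` is a placeholder for the machine-config-daemon pod running on the node under test.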


Actual results:
Even though the new rendered machine config did not trigger a drain execution, the nodes are annotated as drained for this machine config.

$ oc get node -l node-role.kubernetes.io/worker  -o jsonpath='{.items[0].metadata.annotations.machineconfiguration\.openshift\.io/desiredDrain}'
uncordon-rendered-worker-dc596e0454284758c260b1de37796675


Expected results:
The node annotations should reflect only drains that were actually executed.


Additional info:

Comment 1 Yu Qi Zhang 2022-06-20 22:28:00 UTC
Marking low since this is technically unchanged behaviour.

The annotation indicates that an uncordon happened after the update was completed. This always happens, regardless of whether a drain took place: we request an uncordon and the controller performs it (it's a no-op in this case, but just to make sure).

I'm pretty sure we did this because at the very beginning of the no-reboot update implementation, for all updates, we'd cordon the nodes before doing the calculations (this is no longer the case).

We can definitely skip this step. Longer term, the MCO probably shouldn't uncordon without checking who applied the cordon (something we don't track today at all; no metadata for it is recorded anywhere).
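
As an illustration of the behaviour described in this comment, here is a rough model in shell. It is not the MCO's actual code; the function `next_desired_drain`, its arguments, and the two mode names are hypothetical, and it assumes the annotation value has the form `uncordon-<rendered config>` as seen in the report. With `always` it mirrors requesting an uncordon after every update; with `only-if-drained` it sketches one possible reading of "skip this step", leaving the annotation untouched when no drain ran.

```shell
#!/bin/sh
# Rough model only; names and modes are illustrative assumptions.
# Usage: next_desired_drain MODE CURRENT_ANNOTATION NEW_CONFIG DRAINED
#   MODE     "always" (behaviour described above) or "only-if-drained"
#   DRAINED  "yes" if the update actually drained the node, anything else otherwise
next_desired_drain() {
    mode=$1
    current=$2
    new_config=$3
    drained=$4
    if [ "$mode" = "only-if-drained" ] && [ "$drained" != "yes" ]; then
        # No drain ran, so keep whatever the annotation already says.
        printf '%s\n' "$current"
    else
        # Request an uncordon for the new config (a no-op if nothing was cordoned).
        printf '%s\n' "uncordon-$new_config"
    fi
}
```

With the values from this report, `next_desired_drain always uncordon-rendered-worker-cd52afd4bd39d302834b215fcc978be8 rendered-worker-dc596e0454284758c260b1de37796675 no` prints `uncordon-rendered-worker-dc596e0454284758c260b1de37796675`, matching the observed result, while the `only-if-drained` mode would print the previous annotation unchanged.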

Comment 2 Yu Qi Zhang 2022-10-24 16:28:45 UTC
After discussion, we will be closing this as NOTABUG. The behaviour is unchanged and the annotation update is just cosmetic. We will reconsider proper cordon behaviour at a later time.

