Bug 2035005 - MCD is not always removing in progress taint after a successful update
Summary: MCD is not always removing in progress taint after a successful update
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Machine Config Operator
Version: 4.10
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 4.11.0
Assignee: Yu Qi Zhang
QA Contact: Rio Liu
URL:
Whiteboard:
Duplicates: 2034901
Depends On:
Blocks: 2076308
Reported: 2021-12-22 17:22 UTC by Simone Tiraboschi
Modified: 2023-09-18 04:29 UTC (History)

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Clones: 2076308 2102069
Environment:
Last Closed: 2022-08-10 10:41:05 UTC
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift machine-config-operator pull 3064 | 0 | None | Merged | Bug 2035005: Move removeUpdateInProgressTaint functionality to mcc | 2022-04-22 18:29:02 UTC
Red Hat Product Errata | RHSA-2022:5069 | 0 | None | None | None | 2022-08-10 10:41:27 UTC

Internal Links: 2247784

Description Simone Tiraboschi 2021-12-22 17:22:52 UTC
Description of problem:

Version-Release number of MCO (Machine Config Operator) (if applicable):

Platform (AWS, VSphere, Metal, etc.):

Are you certain that the root cause of the issue being reported is the MCO (Machine Config Operator)?
(Y/N/Not sure): Y

How reproducible: Unknown (it does not happen on all the nodes)

Did you catch this issue by running a Jenkins job? If yes, please list:
1. Jenkins job: https://main-jenkins-csb-cnvqe.apps.ocp-c1.prod.psi.redhat.com/view/Upgrade-Pipelines/job/Upgrade-CNV-4.10-Scheduled/17/

2. Profile: ?

Steps to Reproduce:
1. During our tests we apply an ICSP (ImageContentSourcePolicy), which causes a new config to be rendered for our nodes.
2. The MCP starts applying it.

Actual results:
At the end of the process a few nodes (not all of them) still contain:

        "taints": [
            {
                "effect": "PreferNoSchedule",
                "key": "UpdateInProgress"
            }
        ]



Expected results:
After a successful update, the UpdateInProgress taint is removed from every node.
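
To make the expected end state concrete, here is a minimal Python sketch of what "taint removed" means for a node's taint list. It is only an illustration, not the MCO code (which is written in Go); the helper name remove_update_in_progress is hypothetical, and the only assumption taken from this report is the taint key UpdateInProgress.

```python
# Illustrative only: the helper name is hypothetical and this is not MCO code.
TAINT_KEY = "UpdateInProgress"


def remove_update_in_progress(taints):
    """Return a new taint list with the UpdateInProgress entry dropped.

    Other taints are kept untouched; a missing (None) list is treated as empty.
    """
    return [t for t in (taints or []) if t.get("key") != TAINT_KEY]


# Example: the taint list reported above should end up empty after the update.
reported = [{"effect": "PreferNoSchedule", "key": "UpdateInProgress"}]
expected = remove_update_in_progress(reported)
```

With the taint list from the actual results, the expected list is empty; any unrelated taints on the node would be left in place.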

Additional info:

On the node, the currentConfig and desiredConfig annotations already match:
            "machineconfiguration.openshift.io/currentConfig": "rendered-worker-f283e2dd057330b8c4d288348ae5b5cb",
            "machineconfiguration.openshift.io/desiredConfig": "rendered-worker-f283e2dd057330b8c4d288348ae5b5cb",

In the MCD logs:
I1222 06:34:39.073411    3396 update.go:1956] Update completed for config rendered-worker-f283e2dd057330b8c4d288348ae5b5cb and node has been successfully uncordoned
I1222 06:34:39.120947    3396 daemon.go:1283] In desired config rendered-worker-f283e2dd057330b8c4d288348ae5b5cb

But the node is still tainted.
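
The condition described above (config annotations match, taint still present) can be checked mechanically. The following is a rough Python sketch, assuming the input is a parsed `oc get nodes -o json` document in which taints live under spec.taints and annotations under metadata.annotations. The function names are hypothetical; the annotation and taint keys are the ones quoted in this report.

```python
# Hypothetical helper names; keys are copied from the node output in this report.
TAINT_KEY = "UpdateInProgress"
CURRENT = "machineconfiguration.openshift.io/currentConfig"
DESIRED = "machineconfiguration.openshift.io/desiredConfig"


def is_stale_tainted(node):
    """True if the node reached its desired config but still carries the taint."""
    annotations = node.get("metadata", {}).get("annotations") or {}
    taints = node.get("spec", {}).get("taints") or []
    current = annotations.get(CURRENT)
    # A node counts as updated only when both annotations exist and are equal.
    updated = current is not None and current == annotations.get(DESIRED)
    tainted = any(t.get("key") == TAINT_KEY for t in taints)
    return updated and tainted


def stale_tainted_nodes(node_list):
    """Names of affected nodes in a parsed NodeList document."""
    return [
        n["metadata"]["name"]
        for n in node_list.get("items", [])
        if is_stale_tainted(n)
    ]
```

One way to use it is to load the output of `oc get nodes -o json` with json.load and pass the result to stale_tainted_nodes; a non-empty result after the pool reports all nodes updated would match the behaviour reported here.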

Comment 2 Sinny Kumari 2021-12-22 17:27:39 UTC
Ravi, can you please look at this bug? It seems to be a regression from PR https://github.com/openshift/machine-config-operator/pull/2686

Comment 3 Simone Tiraboschi 2021-12-22 17:28:58 UTC
In the machine-config-controller logs we see:

I1222 06:42:29.011398       1 status.go:90] Pool worker: All nodes are updated with rendered-worker-f283e2dd057330b8c4d288348ae5b5cb

Comment 4 Simone Tiraboschi 2021-12-22 17:29:50 UTC
*** Bug 2034901 has been marked as a duplicate of this bug. ***

Comment 17 errata-xmlrpc 2022-08-10 10:41:05 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:5069

Comment 18 Red Hat Bugzilla 2023-09-18 04:29:42 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days
