Bug 1917013 - yum excluders are not set back in nodes after upgrade when it is done in separate phases and "openshift_upgrade_nodes_label" parameter is used to filter which nodes should be upgraded at a time
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Installer
Version: 3.11.0
Hardware: Unspecified
OS: All
Priority: low
Severity: low
Target Milestone: ---
Target Release: 3.11.z
Assignee: Russell Teague
QA Contact: Gaoyun Pei
URL:
Whiteboard:
Depends On: 1933090
Blocks:
Reported: 2021-01-16 14:38 UTC by Joel Rosental R.
Modified: 2024-06-13 23:56 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: The node upgrade playbooks scoped all nodes instead of just nodes filtered by openshift_upgrade_nodes_label.
Consequence: Nodes that were not intended to be upgraded had excluders disabled but not re-enabled.
Fix: Moved the initialization of the variable for filtering nodes to upgrade earlier in the play and scoped the pre/config to the filtered list of nodes.
Result: Only nodes intended for upgrade have yum excluders disabled.
Clone Of:
Environment:
Last Closed: 2021-03-25 09:50:07 UTC
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift openshift-ansible pull 12311 | 0 | None | open | Bug 1917013: Evaluate oo_nodes_to_upgrade before pre/config | 2021-02-25 16:46:59 UTC
Red Hat Product Errata | RHSA-2021:0833 | 0 | None | None | None | 2021-03-25 09:50:22 UTC

Description Joel Rosental R. 2021-01-16 14:38:56 UTC
Version:

$ rpm -q openshift-ansible
openshift-ansible-3.11.318-2.git.1.da17c54.el7.noarch

$ rpm -q ansible
ansible-2.9.16-1.el7ae.noarch


What happened?

While upgrading an OCP 3.11 cluster in separate phases (control plane and nodes), when the nodes upgrade playbook is invoked with the "openshift_upgrade_nodes_label" parameter so that only nodes with a certain label are upgraded, excluders are disabled on all nodes at the beginning of the playbook execution but are not set back.

Only the node specified in the "openshift_upgrade_nodes_label" parameter has proper excluders set after playbook execution.

What did you expect to happen?

Excluders should be set back in all nodes after playbook finishes.

How to reproduce it (as minimally and precisely as possible)?

Perform an OCP upgrade in separate phases (control plane & nodes), and in the nodes phase, launch the playbook with the "openshift_upgrade_nodes_label" parameter set to a label that matches only one of the nodes.
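
For reference, the nodes phase might be launched along the lines below. This is only a sketch: the inventory path and the label value "region=group1" are placeholders, the helper name node_upgrade_cmd is made up for illustration, and the playbook directory assumes the default RPM install location. The helper prints the command rather than running it, so it can be reviewed first.

```shell
# Assumed default location of the openshift-ansible RPM content; override if different.
PLAYBOOK_DIR="${PLAYBOOK_DIR:-/usr/share/ansible/openshift-ansible}"

# node_upgrade_cmd INVENTORY LABEL
# Hypothetical helper: prints the node-phase upgrade command for one label.
node_upgrade_cmd() {
    local inventory="$1" label="$2"
    printf 'ansible-playbook -i %s %s/playbooks/byo/openshift-cluster/upgrades/v3_11/upgrade_nodes.yml -e openshift_upgrade_nodes_label="%s"\n' \
        "$inventory" "$PLAYBOOK_DIR" "$label"
}

# Example (placeholder inventory path and label):
node_upgrade_cmd /path/to/inventory "region=group1"
```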

Comment 3 Matthew Staebler 2021-02-15 19:25:49 UTC
As there is a workaround for this and it is not blocking installs or upgrades, we will not be pursuing a fix.

Comment 4 Victor Medina 2021-02-25 11:54:28 UTC
@mstaeble Hello, could you point me to the workaround? Thanks

Comment 5 Russell Teague 2021-02-25 14:31:41 UTC
The excluders are present to prevent the upgrade of openshift packages during an OS upgrade.  The OS upgrade steps [1] include ensuring the excluders are enabled prior to OS upgrades.

From Step 2:
# atomic-openshift-docker-excluder exclude
# atomic-openshift-excluder exclude


[1] https://docs.openshift.com/container-platform/3.11/upgrading/os_upgrades.html
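
When several nodes were left with excluders disabled, the two commands from Step 2 could be applied to each of them in a loop. A minimal sketch, assuming root SSH access to the nodes; the function name reenable_excluders and the host names in the usage line are hypothetical. Setting SSH=echo gives a dry run that only prints what would be executed.

```shell
# reenable_excluders HOST...
# Runs both excluder commands from the OS upgrade steps on every host given.
# Assumes root SSH access; set SSH=echo to print instead of connecting.
reenable_excluders() {
    local host
    for host in "$@"; do
        "${SSH:-ssh}" "root@${host}" \
            'atomic-openshift-docker-excluder exclude && atomic-openshift-excluder exclude'
    done
}

# Usage (hypothetical host names):
#   SSH=echo reenable_excluders node1.example.com node2.example.com   # dry run
#   reenable_excluders node1.example.com node2.example.com
```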

Comment 6 Victor Medina 2021-02-25 16:34:33 UTC
Thanks

Comment 7 Russell Teague 2021-02-25 16:45:59 UTC
When investigating this issue, I found that using openshift_upgrade_nodes_label was broken in the most recent code.  I've opened a bug for that issue, https://bugzilla.redhat.com/show_bug.cgi?id=1933090.  Since I was already fixing that issue, I also worked up a fix for this bug and will submit a patch shortly.

Comment 9 Gaoyun Pei 2021-03-13 15:01:04 UTC
Could reproduce this issue with openshift-ansible-3.11.318-1.git.0.bccee5b.el7.noarch.rpm

The step "disable openshift excluder" was executed on all the nodes, but by the end of playbook playbooks/byo/openshift-cluster/upgrades/v3_11/upgrade_nodes.yml the openshift excluder was re-enabled only on the node matching "openshift_upgrade_nodes_label".

22:30:10  TASK [openshift_excluder : disable openshift excluder] *************************
22:30:11  changed: [ci-vm-10-0-148-147.hosted.upshift.rdu2.redhat.com] => {"changed": true, "cmd": ["/sbin/atomic-openshift-excluder", "unexclude"], "delta": "0:00:00.054883", "end": "2021-03-13 09:30:10.031549", "rc": 0, "start": "2021-03-13 09:30:09.976666", "stderr": "", "stderr_lines": [], "stdout": "", "stdout_lines": []}
22:30:11  changed: [ci-vm-10-0-150-233.hosted.upshift.rdu2.redhat.com] => {"changed": true, "cmd": ["/sbin/atomic-openshift-excluder", "unexclude"], "delta": "0:00:00.069715", "end": "2021-03-13 09:30:10.230393", "rc": 0, "start": "2021-03-13 09:30:10.160678", "stderr": "", "stderr_lines": [], "stdout": "", "stdout_lines": []}

...

22:36:07  TASK [openshift_excluder : Enable openshift excluder] **************************
22:36:08  changed: [ci-vm-10-0-150-233.hosted.upshift.rdu2.redhat.com] => {"changed": true, "cmd": ["/sbin/atomic-openshift-excluder", "exclude"], "delta": "0:00:00.075231", "end": "2021-03-13 09:36:07.115162", "rc": 0, "start": "2021-03-13 09:36:07.039931", "stderr": "", "stderr_lines": [], "stdout": "", "stdout_lines": []}



Verified with openshift-ansible-3.11.400-1.git.0.3f4fe20.el7.noarch.rpm: the openshift excluder is now disabled only on the node matching "openshift_upgrade_nodes_label".

 TASK [openshift_excluder : disable openshift excluder] *************************
22:44:02 
 changed: [ci-vm-10-0-150-233.hosted.upshift.rdu2.redhat.com] => {"changed": true, "cmd": ["/sbin/atomic-openshift-excluder", "unexclude"], "delta": "0:00:00.068328", "end": "2021-03-13 09:44:00.608174", "rc": 0, "start": "2021-03-13 09:44:00.539846", "stderr": "", "stderr_lines": [], "stdout": "", "stdout_lines": []}
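
The comparison made by hand above can be roughly automated for a saved playbook log. The sketch below is an assumption-laden helper, not part of openshift-ansible: the function name excluders_left_disabled is made up, and it relies only on the task titles and "changed: [host]" lines looking like the excerpts in this comment. It lists hosts where the openshift excluder was disabled but never re-enabled.

```shell
# excluders_left_disabled [LOGFILE...]
# Reads an ansible-playbook log (files or stdin) and prints, sorted, every host
# that was "changed" by the disable task and not later by the enable task.
excluders_left_disabled() {
    awk '
        /TASK \[/ {
            # Remember which excluder task (if any) the following lines belong to.
            task = ""
            if ($0 ~ /openshift_excluder : [Dd]isable openshift excluder\]/) task = "disable"
            if ($0 ~ /openshift_excluder : [Ee]nable openshift excluder\]/)  task = "enable"
            next
        }
        task != "" && match($0, /changed: \[[^]]+\]/) {
            host = substr($0, RSTART, RLENGTH)
            sub(/^changed: \[/, "", host)
            sub(/\]$/, "", host)
            if (task == "disable") pending[host] = 1
            else delete pending[host]
        }
        END { for (h in pending) print h }
    ' "$@" | sort
}

# Usage (hypothetical log file name):
#   excluders_left_disabled upgrade_nodes.log
```

An empty result suggests every host that had its excluder disabled also had it re-enabled in the same run; it does not check the docker excluder, which would need a similar pass.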

Comment 12 errata-xmlrpc 2021-03-25 09:50:07 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 3.11.404 security and bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:0833

