Bug 2037620 - Upgrade playbook should quit directly when trying to upgrade RHEL-7 workers to 4.10
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Installer
Version: 4.10
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.11.0
Assignee: Jeremiah Stuever
QA Contact: Gaoyun Pei
URL:
Whiteboard:
Depends On:
Blocks: 2053651
 
Reported: 2022-01-06 06:48 UTC by Gaoyun Pei
Modified: 2022-08-10 10:41 UTC (History)

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-08-10 10:41:16 UTC
Target Upstream Version:
Embargoed:



Links
System ID Private Priority Status Summary Last Updated
Github openshift openshift-ansible pull 12369 0 None open Bug 2037620: Require RHEL v8.4 or newer beginning with OpenShift v4.10 2022-02-09 06:41:21 UTC
Red Hat Product Errata RHSA-2022:5069 0 None None None 2022-08-10 10:41:49 UTC

Description Gaoyun Pei 2022-01-06 06:48:35 UTC
Version:
openshift-ansible-4.10.0-202112151834.p0.gcc445ce.assembly.stream.el7.noarch.rpm

What happened?
As our official documentation states, RHEL-7 workers are supposed to be removed in 4.9 [1]:
"You cannot upgrade RHEL 7 compute machines to RHEL 8. You must deploy new RHEL 8 hosts, and the old RHEL 7 hosts should be removed."
However, there are still cases where customers would try to upgrade their RHEL-7 workers to 4.10 from 4.9 or 4.8 (EUS-to-EUS upgrade).

So it would be better to add a check in the Ansible code that runs before the upgrade steps start against RHEL-7 workers.


[1] https://docs.openshift.com/container-platform/4.9/updating/updating-cluster-rhel-compute.html#rhel-compute-updating-minor_updating-cluster-rhel-compute


What did you expect to happen?
The upgrade playbook should abort immediately when it detects that it is upgrading a RHEL-7 worker, and print a meaningful message such as:
RHEL-7 workers are NOT supported in 4.10. You must deploy new RHEL-8 or RHCOS workers, and the old RHEL 7 hosts should be removed.

Comment 1 Patrick Dillon 2022-01-13 18:57:19 UTC
This is a good suggestion and we should incorporate it, but I am trying to determine whether this is a blocker:

- If we don't make this change it would allow someone to upgrade into an unsupported state, but it is also documented that it is unsupported.
- openshift-ansible is not shipped as part of the release image, so how does that relate to the entire blocker+/- system?

Comment 2 Matthew Staebler 2022-01-13 21:48:22 UTC
Given that openshift-ansible is not part of the release, this by definition cannot be a blocker.

Comment 3 Matthew Staebler 2022-01-13 21:51:18 UTC
(In reply to Matthew Staebler from comment #2)
> Given that openshift-ansible is not part of the release, this by definition
> cannot be a blocker.

Actually, that is not quite right. I could see a scenario where something outside of the release is required to be in place or fixed prior to the release. I could also see where the openshift-ansible playbooks would fit that description.

Nevertheless, I do not think this meets the bar for blocking the 4.10 release.

Comment 4 Scott Dodson 2022-01-14 15:48:38 UTC
In case my opinion matters, I agree that this should not be considered a blocker.

Comment 7 Gaoyun Pei 2022-02-12 06:17:21 UTC
Verified this bug with openshift-ansible-4.11.0-202202111945.p0.g4f59ed0.assembly.stream.el7.noarch.rpm 

On a RHEL-7 host, running playbooks/upgrade.yml or playbooks/scaleup.yml now fails with the following error:


TASK [openshift_node : Set fact l_cluster_version] *****************************
Saturday 12 February 2022  13:39:22 +0800 (0:00:00.693)       0:00:02.027 ***** 
ok: [ip-10-0-51-150.us-east-2.compute.internal] => {"ansible_facts": {"l_cluster_version": "4.11"}, "changed": false}

TASK [openshift_node : Fail if not using RHEL8 beginning with version 4.10] ****
Saturday 12 February 2022  13:39:22 +0800 (0:00:00.088)       0:00:02.116 ***** 
fatal: [ip-10-0-51-150.us-east-2.compute.internal]: FAILED! => {"changed": false, "msg": "As of v4.10, RHEL nodes must be at least version 8.4"}
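
For illustration only, a guard that produces output like the above could be sketched as the following Ansible task. The task name and message are copied from the log; the `when` conditions are an assumption and are not taken from the merged pull request. `l_cluster_version` is the fact set by the preceding task in the log, and `ansible_facts['distribution_version']` is the standard gathered fact for the host OS version.

```yaml
# Hypothetical sketch, not the actual openshift-ansible implementation.
# Assumes facts have been gathered and l_cluster_version is already set
# (as in the "Set fact l_cluster_version" task shown in the log).
- name: Fail if not using RHEL8 beginning with version 4.10
  ansible.builtin.fail:
    msg: "As of v4.10, RHEL nodes must be at least version 8.4"
  when:
    # The version test compares component by component, so 4.9 < 4.10
    - l_cluster_version is version('4.10', '>=')
    # A RHEL 7 host reports a 7.x value here, which is below 8.4
    - ansible_facts['distribution_version'] is version('8.4', '<')
```

Using the `version` test rather than a plain string comparison matters here, since a string comparison would order "4.9" after "4.10".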

Comment 12 errata-xmlrpc 2022-08-10 10:41:16 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:5069

