Description of problem:

1. When attempting to install a z/VM hosted OCP 4.7.24 on Z cluster, using the 4.7.0-0.nightly-s390x-2021-08-06-180814 build and the accompanying RHCOS 47.84.202108052001-0 build for the bootstrap node, the master/control nodes do not acquire their private network interface and the installation process does not continue, with the OCP cluster installation ultimately failing.
2. The master/control nodes do not acquire their network interface and hostname, with the master/control nodes booted but not network accessible, and with hostname "localhost".
3. For the OCP 4.7.23 on Z build, the corresponding required RHCOS build version is 47.83.202107271611-0, which is built on RHEL 8.3. This consistently installs the OCP 4.7.23 on Z z/VM hosted cluster without issue.
4. For the OCP 4.7.24 on Z build, the corresponding required RHCOS build version is 47.84.202108052001-0, which is built on RHEL 8.4. This consistently fails to install the OCP 4.7.24 on Z z/VM hosted cluster.
5. The OCP 4.7.24 on Z build is the first OCP 4.7.z build that includes an RHCOS 47 build version built on RHEL 8.4.
6. A workaround for this RHCOS 47.84 build install issue is to use the RHCOS 47.83 build 47.83.202107271611-0 for the bootstrap node install, after which all of the master/control nodes and worker/compute nodes will successfully install, including all network configuration. These master/control nodes and worker/compute nodes will successfully install the required RHCOS 47.84.202108052001-0 build as part of the install process.
7. This OCP 4.7.24 on Z installation issue with the introduction of RHCOS 47.84 is very similar to the issue found with the initial introduction of RHCOS 48.84 for OCP 4.8 and as documented in Red Hat OpenShift bugzilla https://bugzilla.redhat.com/show_bug.cgi?id=1950974

Version-Release number of selected component (if applicable):
OCP 4.7.24 (4.7.0-0.nightly-s390x-2021-08-06-180814)

How reproducible:
Consistently reproducible

Steps to Reproduce:
1. Initiate installation of OCP 4.7.24 on Z cluster using the OCP 4.7.0-0.nightly-s390x-2021-08-06-180814 build and the corresponding RHCOS 47.84.202108052001-0 build.

Actual results:
The z/VM hosted OCP 4.7.24 on Z install process will fail as the master/control nodes do not configure their network interfaces.

Expected results:
The z/VM hosted OCP 4.7.24 on Z install process should succeed with the master/control nodes properly configuring their network interfaces.

Additional info:
Thank you.
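For anyone triaging similar reports, the RHCOS build IDs quoted in the description appear to encode both the OCP stream and the RHEL base (47.83 for OCP 4.7 on RHEL 8.3, 47.84 for OCP 4.7 on RHEL 8.4, 48.84 for OCP 4.8 on RHEL 8.4). The following is only a sketch under that assumption; the layout is inferred from the builds named in this report, and `parse_rhcos_build` and `RhcosBuild` are hypothetical helper names, not part of any OpenShift tooling.

```python
import re
from typing import NamedTuple


class RhcosBuild(NamedTuple):
    ocp_stream: str  # e.g. "4.7"
    rhel_base: str   # e.g. "8.4"
    timestamp: str   # e.g. "202108052001"
    respin: int      # trailing "-N" counter


# Assumed layout (inferred from this report, not from a specification):
#   <OCP major><OCP minor>.<RHEL major><RHEL minor>.<12-digit timestamp>-<N>
_BUILD_RE = re.compile(r"^(\d)(\d+)\.(\d)(\d+)\.(\d{12})-(\d+)$")


def parse_rhcos_build(build_id: str) -> RhcosBuild:
    """Split a build ID such as '47.84.202108052001-0' into its parts."""
    match = _BUILD_RE.match(build_id)
    if match is None:
        raise ValueError(f"unrecognized RHCOS build ID: {build_id!r}")
    ocp_major, ocp_minor, rhel_major, rhel_minor, stamp, respin = match.groups()
    return RhcosBuild(
        ocp_stream=f"{ocp_major}.{ocp_minor}",
        rhel_base=f"{rhel_major}.{rhel_minor}",
        timestamp=stamp,
        respin=int(respin),
    )


if __name__ == "__main__":
    for build in ("47.83.202107271611-0", "47.84.202108052001-0"):
        print(build, "->", parse_rhcos_build(build))
```

A check like this makes it easy to flag the first bootstrap image in a stream whose RHEL base changed, which is the condition described in items 3 to 5 above.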
Yes. Because 4.7 moved to RHEL 8.4, it pulled in the new systemd, which has a bug that prevents the system-connections-merged directory from mounting: https://bugzilla.redhat.com/show_bug.cgi?id=1952686. It looks like https://github.com/openshift/machine-config-operator/pull/2543 needs to be backported to 4.7. Kyle, just to confirm, can you check the status of this systemd unit? "systemctl status etc-NetworkManager-system\x2dconnections\x2dmerged.mount" Thanks, Prashanth
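For reference, a mount unit name is the systemd-escaped form of its mount path, which is why the dashes in the directory name show up as \x2d. The sketch below approximates what `systemd-escape --path` does for simple absolute paths. It assumes the mount point is /etc/NetworkManager/system-connections-merged (taken from the directory name mentioned above), and `systemd_escape_path` and `mount_unit_name` are hypothetical helper names; it is a simplified illustration, not a replacement for the real tool.

```python
_ALLOWED = set(
    "abcdefghijklmnopqrstuvwxyz"
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "0123456789:_."
)


def systemd_escape_path(path: str) -> str:
    """Approximate `systemd-escape --path` for simple absolute paths."""
    # Drop leading/trailing slashes and collapse repeated ones.
    trimmed = "/".join(part for part in path.split("/") if part)
    if not trimmed:
        return "-"  # the root directory escapes to a single dash
    out = []
    for index, byte in enumerate(trimmed.encode("utf-8")):
        char = chr(byte)
        if char == "/":
            out.append("-")  # path separators become dashes
        elif char in _ALLOWED and not (index == 0 and char == "."):
            out.append(char)
        else:
            # Everything else, including a literal "-", is hex-escaped.
            out.append(f"\\x{byte:02x}")
    return "".join(out)


def mount_unit_name(mount_point: str) -> str:
    """Return the .mount unit name for a mount point."""
    return systemd_escape_path(mount_point) + ".mount"


if __name__ == "__main__":
    print(mount_unit_name("/etc/NetworkManager/system-connections-merged"))
```

When passing such a name to systemctl from a shell, quote it so the backslashes are not interpreted by the shell.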
Re-assigning to Prashanth as he has worked on a similar bug before. The bug triage team is setting "Blocker+" as the bug blocks the build. Also adding "reviewed-in-sprint" as this is a new bug and may take time to fix.
*** Bug 1992676 has been marked as a duplicate of this bug. ***
Folks, The same or similar network configuration related issue(s) for the master/control plane nodes appears to be present with the RHCOS 47.84.202108161003-0 build specified for the OCP 4.7.25 4.7.0-0.nightly-s390x-2021-08-16-204650 build. Thank you, Kyle
You're not using the bootimages defined in the installer, which is not a supported installation path. We should still fix this bug in case we ever bump the boot images, but I'm dropping blocker+.
Hi Kyle, Are you getting the RHCOS bootimages from https://releases-rhcos-art.cloud.privileged.psi.redhat.com/ ? The bootimages you use need to be aligned with the ones here: https://github.com/openshift/installer/blob/release-4.7/data/data/rhcos-s390x.json as the installer is the definitive source of bootimages. These images are available here: https://mirror.openshift.com/pub/openshift-v4/s390x/dependencies/rhcos/4.7/latest/ and are the ones that should be used, since that is how customers would install. Thanks, Prashanth
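A quick way to confirm that a bootstrap image matches the installer's pinned bootimage is to compare build IDs. The sketch below assumes that rhcos-s390x.json carries a top-level "buildid" field; that field name, the sample content, and the helper names `pinned_build_id` and `is_pinned_bootimage` are assumptions for illustration and should be checked against the actual file in the release-4.7 branch. The sample value is borrowed from this report and is not necessarily the build the installer pins.

```python
import json

# Hypothetical excerpt standing in for the installer's rhcos-s390x.json.
# The real file may use different fields and pins its own build ID.
SAMPLE_METADATA = """
{
  "buildid": "47.83.202107271611-0"
}
"""


def pinned_build_id(metadata_text: str) -> str:
    """Return the build ID recorded in the (assumed) metadata layout."""
    metadata = json.loads(metadata_text)
    return metadata["buildid"]


def is_pinned_bootimage(candidate_build: str, metadata_text: str) -> bool:
    """True if the candidate bootstrap build matches the pinned one."""
    return candidate_build == pinned_build_id(metadata_text)


if __name__ == "__main__":
    for build in ("47.83.202107271611-0", "47.84.202108052001-0"):
        status = "pinned" if is_pinned_bootimage(build, SAMPLE_METADATA) else "not pinned"
        print(f"{build}: {status}")
```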
*** Bug 1992245 has been marked as a duplicate of this bug. ***
Hi Kyle, The latest nightlies have the fix: https://openshift-release-s390x.apps.ci.l2s4.p1.openshiftapps.com/#4.7.0-0.nightly-s390x . Could you please test it and confirm that the problem is fixed? Thanks Prashanth
Prashanth, Thanks for the update. 1. We've successfully performed multiple zVM hosted install tests with OCP 4.9 nightly build 4.7.0-0.nightly-s390x-2021-08-21-044859, with RHCOS 47.84.202108181404-0 build for the bootstrap 2. We're in the process of performing some additional zVM hosted tests with OCP 4.9 nightly build 4.7.0-0.nightly-s390x-2021-08-25-185227 and RHCOS 47.84.202108251004-0 for the bootstrap, and will provide a follow-on update here. Thank you, Kyle
(In reply to Prashanth Sundararaman from comment #6)
> Hi Kyle,
>
> Are you getting the RHCOS bootimages from
> https://releases-rhcos-art.cloud.privileged.psi.redhat.com/ ? The bootimages
> you should be using need to be aligned with the ones here:
> https://github.com/openshift/installer/blob/release-4.7/data/data/rhcos-s390x.json
> as the installer is the definitive source of bootimages.
>
> These images are available here:
> https://mirror.openshift.com/pub/openshift-v4/s390x/dependencies/rhcos/4.7/latest/
> and should be the ones used which is how the customers would install.
>
> Thanks
> Prashanth

Prashanth,

Thanks for the information. For due diligence purposes, including to help find RHCOS build issues before an RHCOS build is potentially elevated/promoted to the latest customer available RHCOS build (for example at https://mirror.openshift.com/pub/openshift-v4/s390x/dependencies/rhcos/4.7/latest/), we test with both this latest customer available RHCOS build and the RHCOS build documented in the OCP 4.x build's release.txt file.

Thank you,
Kyle
(In reply to krmoser from comment #11)
> Prashanth,
>
> Thanks for the update.
>
> 1. We've successfully performed multiple zVM hosted install tests with OCP
> 4.9 nightly build 4.7.0-0.nightly-s390x-2021-08-21-044859, with RHCOS
> 47.84.202108181404-0 build for the bootstrap
>
> 2. We're in the process of performing some additional zVM hosted tests with
> OCP 4.9 nightly build 4.7.0-0.nightly-s390x-2021-08-25-185227 and RHCOS
> 47.84.202108251004-0 for the bootstrap, and will provide a follow-on update
> here.
>
> Thank you,
> Kyle

My apologies for the typos in comment 11 above, where I wrote "OCP 4.9" but meant "OCP 4.7".

Thank you.
Prashanth,

We've successfully installed the following OCP 4.7 nightly builds in z/VM hosted environments, with the listed RHCOS latest and 47.84 builds for the bootstrap node.

1. OCP 4.7 nightly build 4.7.0-0.nightly-s390x-2021-08-21-044859
================================================================
1. RHCOS 4.7.13 (latest)
2. RHCOS 47.84.202108181404-0
3. RHCOS 47.84.202108251004-0

2. OCP 4.7 nightly build 4.7.0-0.nightly-s390x-2021-08-25-185227
================================================================
1. RHCOS 4.7.13 (latest)
2. RHCOS 47.84.202108181404-0
3. RHCOS 47.84.202108251004-0

Thank you,
Kyle
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Important: OpenShift Container Platform 4.7.28 security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2021:3262