Description of problem:

OCP 4.11 on Z build 4.11.0-0.nightly-s390x-2022-06-16-003753 does not install in zVM environments (and potentially KVM). The installation does not proceed past attempting to install the network cluster operator, and the 3 master nodes remain in the "NotReady" state.

For zVM environments:
===============================================================================================================================================

Installs fail with the network operator unable to complete installation. Representative "oc get clusterversion", "oc get nodes", and "oc get co" output:

NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version             False       True          13m     Unable to apply 4.11.0-0.nightly-s390x-2022-06-16-003753: an unknown error has occurred: MultipleErrors

NAME                                          STATUS     ROLES    AGE   VERSION
master-0.pok-96.ocptest.pok.stglabs.ibm.com   NotReady   master   13m   v1.24.0+25f9057
master-1.pok-96.ocptest.pok.stglabs.ibm.com   NotReady   master   13m   v1.24.0+25f9057
master-2.pok-96.ocptest.pok.stglabs.ibm.com   NotReady   master   13m   v1.24.0+25f9057

NAME                                       VERSION                                    AVAILABLE   PROGRESSING   DEGRADED   SINCE   MESSAGE
authentication
baremetal
cloud-controller-manager                   4.11.0-0.nightly-s390x-2022-06-16-003753   True        False         False      12m
cloud-credential                                                                      True        False         False      12m
cluster-autoscaler
config-operator
console
csi-snapshot-controller
dns
etcd
image-registry
ingress
insights
kube-apiserver
kube-controller-manager
kube-scheduler
kube-storage-version-migrator
machine-api
machine-approver
machine-config
marketplace
monitoring
network                                                                               False       True          True       12m     The network is starting up
node-tuning
openshift-apiserver
openshift-controller-manager
openshift-samples
operator-lifecycle-manager
operator-lifecycle-manager-catalog
operator-lifecycle-manager-packageserver
service-ca
storage

Version-Release number of selected component (if applicable):
OCP 4.11 on Z build 4.11.0-0.nightly-s390x-2022-06-16-003753

How reproducible:
Consistently reproducible.

Steps to Reproduce:
1. Attempt to install OCP on Z build 4.11.0-0.nightly-s390x-2022-06-16-003753 in a zVM environment.

Actual results:
The OCP 4.11 on Z 4.11.0-0.nightly-s390x-2022-06-16-003753 cluster fails to install, with the etcd and network cluster operators failing to complete installation.

Expected results:
The OCP 4.11 on Z 4.11.0-0.nightly-s390x-2022-06-16-003753 cluster should consistently install successfully.

Additional info:
Will provide the partial results of a must-gather a bit later this morning. Thank you.
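To see why the masters stay NotReady before a must-gather is available, the node conditions can be checked directly from the bastion. A minimal sketch, not part of the original report: the node name is taken from the "oc get nodes" output above, and the sed range assumes "Conditions:" precedes "Addresses:" in `oc describe node` output.

```shell
#!/bin/sh
# Sketch: print the Conditions section for one NotReady master node.
# NODE comes from the "oc get nodes" output above; the sed range is an
# assumption about the layout of `oc describe node` output.
NODE=master-0.pok-96.ocptest.pok.stglabs.ibm.com
if command -v oc >/dev/null 2>&1; then
  # On a NotReady node this typically shows whether the kubelet is
  # waiting on the CNI ("container runtime network not ready").
  oc describe node "$NODE" | sed -n '/Conditions:/,/Addresses:/p'
  RESULT=described
else
  # No oc CLI in this shell; run on the bastion with KUBECONFIG set.
  RESULT=skipped
fi
```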
Here is the output when attempting to collect an "oc adm must-gather":

[root@ospbmgr4 ~]# oc adm must-gather
[must-gather      ] OUT the server could not find the requested resource (get imagestreams.image.openshift.io must-gather)
[must-gather      ] OUT
[must-gather      ] OUT Using must-gather plug-in image: registry.redhat.io/openshift4/ose-must-gather:latest
When opening a support case, bugzilla, or issue please include the following summary data along with any other requested information:
ClusterID: c7c6149b-cbc7-4721-843c-4083934c88dc
ClusterVersion: Installing "4.11.0-0.nightly-s390x-2022-06-16-003753" for 3 hours: Working towards 4.11.0-0.nightly-s390x-2022-06-16-003753: 647 of 802 done (80% complete)
ClusterOperators:
	clusteroperator/authentication is not available (<missing>) because <missing>
	clusteroperator/baremetal is not available (<missing>) because <missing>
	clusteroperator/cluster-autoscaler is not available (<missing>) because <missing>
	clusteroperator/config-operator is not available (<missing>) because <missing>
	clusteroperator/console is not available (<missing>) because <missing>
	clusteroperator/csi-snapshot-controller is not available (<missing>) because <missing>
	clusteroperator/dns is not available (<missing>) because <missing>
	clusteroperator/etcd is not available (<missing>) because <missing>
	clusteroperator/image-registry is not available (<missing>) because <missing>
	clusteroperator/ingress is not available (<missing>) because <missing>
	clusteroperator/insights is not available (<missing>) because <missing>
	clusteroperator/kube-apiserver is not available (<missing>) because <missing>
	clusteroperator/kube-controller-manager is not available (<missing>) because <missing>
	clusteroperator/kube-scheduler is not available (<missing>) because <missing>
	clusteroperator/kube-storage-version-migrator is not available (<missing>) because <missing>
	clusteroperator/machine-api is not available (<missing>) because <missing>
	clusteroperator/machine-approver is not available (<missing>) because <missing>
	clusteroperator/machine-config is not available (<missing>) because <missing>
	clusteroperator/marketplace is not available (<missing>) because <missing>
	clusteroperator/monitoring is not available (<missing>) because <missing>
	clusteroperator/network is not available (The network is starting up) because DaemonSet "/openshift-ovn-kubernetes/ovn-ipsec" rollout is not making progress - last change 2022-06-17T10:31:45Z
		DaemonSet "/openshift-ovn-kubernetes/ovnkube-master" rollout is not making progress - pod ovnkube-master-7bbwg is in CrashLoopBackOff State
		DaemonSet "/openshift-ovn-kubernetes/ovnkube-master" rollout is not making progress - pod ovnkube-master-mft6s is in CrashLoopBackOff State
		DaemonSet "/openshift-ovn-kubernetes/ovnkube-master" rollout is not making progress - pod ovnkube-master-nwbz5 is in CrashLoopBackOff State
		DaemonSet "/openshift-ovn-kubernetes/ovnkube-master" rollout is not making progress - last change 2022-06-17T10:31:46Z
		DaemonSet "/openshift-ovn-kubernetes/ovnkube-node" rollout is not making progress - last change 2022-06-17T10:31:46Z
	clusteroperator/node-tuning is not available (<missing>) because <missing>
	clusteroperator/openshift-apiserver is not available (<missing>) because <missing>
	clusteroperator/openshift-controller-manager is not available (<missing>) because <missing>
	clusteroperator/openshift-samples is not available (<missing>) because <missing>
	clusteroperator/operator-lifecycle-manager is not available (<missing>) because <missing>
	clusteroperator/operator-lifecycle-manager-catalog is not available (<missing>) because <missing>
	clusteroperator/operator-lifecycle-manager-packageserver is not available (<missing>) because <missing>
	clusteroperator/service-ca is not available (<missing>) because <missing>
	clusteroperator/storage is not available (<missing>) because <missing>

[must-gather      ] OUT namespace/openshift-must-gather-mp78s created
[must-gather      ] OUT clusterrolebinding.rbac.authorization.k8s.io/must-gather-2npds created
Warning: would violate PodSecurity "restricted:latest": allowPrivilegeEscalation != false (containers "gather", "copy" must set securityContext.allowPrivilegeEscalation=false), unrestricted capabilities (containers "gather", "copy" must set securityContext.capabilities.drop=["ALL"]), runAsNonRoot != true (pod or containers "gather", "copy" must set securityContext.runAsNonRoot=true), seccompProfile (pod or containers "gather", "copy" must set securityContext.seccompProfile.type to "RuntimeDefault" or "Localhost")
[must-gather      ] OUT pod for plug-in image registry.redhat.io/openshift4/ose-must-gather:latest created
^C
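Since the must-gather had to be interrupted, the crashing ovnkube-master pods named above can be inspected directly. A minimal sketch, not from the original report; the `app=ovnkube-master` pod label and the `ovnkube-master` container name are assumptions about the OVN-Kubernetes manifests and may need adjusting:

```shell
#!/bin/sh
# Sketch: save current and previous-instance logs from each
# ovnkube-master pod in the openshift-ovn-kubernetes namespace.
# The label selector and container name below are assumptions.
NS=openshift-ovn-kubernetes
if command -v oc >/dev/null 2>&1; then
  for pod in $(oc -n "$NS" get pods -l app=ovnkube-master -o name); do
    name=${pod##*/}
    oc -n "$NS" logs "$pod" -c ovnkube-master > "${name}.log" 2>&1 || true
    # --previous captures the log of the last crashed container instance,
    # which is what matters for a CrashLoopBackOff pod.
    oc -n "$NS" logs --previous "$pod" -c ovnkube-master > "${name}.prev.log" 2>&1 || true
  done
  RESULT=collected
else
  # No oc CLI in this shell; run on the bastion with KUBECONFIG set.
  RESULT=skipped
fi
```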
Created attachment 1890914 [details]
master-0 node journalctl log
Same as: https://bugzilla.redhat.com/show_bug.cgi?id=2098151

All nightly payloads are experiencing this issue and a fix is being worked on.
Going to go ahead and close this as a duplicate, based on the info in https://bugzilla.redhat.com/show_bug.cgi?id=2098123#c3.

*** This bug has been marked as a duplicate of bug 2098151 ***
FYI: the OCP on Z Solution Test team has confirmed the same install issue exists in KVM environments. Thank you.