Description of problem:

When attempting to install an OCP 4.4 on Z zVM-based cluster using the 05-18-2020 or 05-25-2020 OCP 4.4 nightly builds, a serious installation issue occurs when the zVM worker nodes are xautolog booted from their zVM reader kernel.img file within a few seconds after the master nodes are xautolog booted from their corresponding zVM reader kernel.img file. The worker nodes do not properly integrate into the OCP 4.4 cluster; there tend to be pending CSRs and/or "Internal Server Error" responses to the GET operations against the bastion server (which can even loop indefinitely). Even if these CSRs are subsequently approved, the worker node cluster integration is not completed, or not completed properly.

1. The rhcos-44.81.202005180840-0-installer-kernel-s390x and rhcos-44.81.202005180840-0-installer-initramfs.s390x.img components are used.

2. Our team has encountered this issue multiple times, and it is easily reproducible in multiple zVM environments (and a KVM POC environment) on System z. We are testing on an IBM System z z15 server.

3. When this sequence of bootstrap, master, and then worker node zVM xautolog reader file boots is performed with only a few seconds between the master and worker nodes, the worker nodes do not properly integrate into the OCP 4.4 cluster, or sometimes do not integrate at all without manual intervention. One condition encountered on the worker nodes is an infinite loop when attempting a GET from the repo server.

4. With OCP 4.2 and 4.3, including OCP 4.3.21, using the same zVM environments, the zVM xautolog sequence of booting the bootstrap, master, and worker nodes in quick succession does not cause an issue with the cluster installation: all master and worker nodes successfully join the cluster, and all cluster operators start successfully with an AVAILABLE status of "True".

5. With OCP 4.4 (unless a potential workaround is employed), this xautolog sequence of booting the bootstrap, master, and worker nodes in quick succession causes installation issues that can be apparent at install time, or later when running workload. Specifically, when one or both worker nodes in a two-worker-node configuration do not properly join the cluster (i.e., the "oc get nodes" command shows only one worker node with a STATUS of "Ready"), the authentication and console cluster operators may never achieve an AVAILABLE status of "True". The authentication cluster operator AVAILABLE status may stay at "Unknown" indefinitely, and the console cluster operator AVAILABLE status may remain at "False" indefinitely.

6. With OCP 4.4, one workaround that we have had relatively good success with is to xautolog boot the bootstrap and master nodes in quick succession, and then wait approximately 500 seconds before continuing with the xautolog of the worker nodes. (A minimal sketch of this timing is shown after this description.)

7. This worker node workaround appears to be necessary because the 3 master nodes do not achieve a STATUS of "Ready" for close to 8 minutes (about 480 seconds). Although a STATUS of "Ready" on the 3 master nodes is not a prerequisite in OCP 4.3 for the worker nodes to begin integrating into the cluster, it does appear to be a prerequisite in OCP 4.4. The underlying issue on the master nodes seems to be the etcd readiness state required before the worker nodes can properly integrate into the cluster.

8. Testing this OCP 4.4 cluster installation workaround across 17+ installations, in all but possibly 1-2 tests, all cluster operators have consistently achieved an AVAILABLE state of "True" and remained in that state after running workload.

Thank you.

Version-Release number of selected component (if applicable): OCP 4.4 44.81.202005180840-0

How reproducible: Can be reproduced consistently

Steps to Reproduce: Please see the information in the "Description of problem" section above.

Actual results: The OCP 4.4 zVM cluster does not install properly when xautolog booting the kernel img files in quick succession.

Expected results: The OCP 4.4 zVM cluster installs properly, using the same method that is successful with OCP 4.3.

Additional info:
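As mentioned in item 6 above, here is a minimal sketch of the boot sequence with the ~500-second delay. It assumes the CP XAUTOLOG command is issued via the vmcp tool from a suitably privileged zVM guest, and uses placeholder guest names (BOOTSTRP, MASTER0-2, WORKER0-1); the real guest names and privilege setup will differ per environment.

# Boot the bootstrap and master guests in quick succession; each guest
# IPLs the kernel.img/initramfs previously punched to its zVM reader.
vmcp XAUTOLOG BOOTSTRP
for guest in MASTER0 MASTER1 MASTER2; do
    vmcp XAUTOLOG "$guest"
done

# Workaround: wait ~500 seconds so the masters can reach the Ready state
# before the workers begin their installation.
sleep 500

for guest in WORKER0 WORKER1; do
    vmcp XAUTOLOG "$guest"
done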
Moving this to the Multi-Arch team for further investigation. "Build" is reserved for the OpenShift image build APIs and underlying components. See https://docs.openshift.com/container-platform/4.4/builds/understanding-image-builds.html
I can confirm that the workers are not added automatically. After approving the pending CSRs manually, the worker nodes were added to the cluster, but it took another 32 minutes until the workers became Ready. With the workers in the Ready state, it took another 12 minutes until all cluster operators were available. The total duration of the installation was 1h 7min.

System: z13, zVM, 3 masters, 3 workers, FCP, HiperSockets; nodes were IPLed in quick succession.

Client Version: 4.4.0-0.nightly-s390x-2020-06-01-021037
Server Version: 4.4.0-0.nightly-s390x-2020-06-01-021037
Kubernetes Version: v1.17.1+f5fb168
Kernel Version: 4.18.0-147.8.1.el8_1.s390x
OS Image: Red Hat Enterprise Linux CoreOS 44.81.202005250840-0 (Ootpa)
Operating System: linux
Architecture: s390x
Container Runtime Version: cri-o://1.17.4-12.dev.rhaos4.4.git2be4d9c.el8
Re-assigning this to Andy.
An `oc adm must-gather` would be helpful
The requested "oc adm must-gather" tar file from the OCP 4.4 cluster is 322MB (exceeding the 19.5MB attachment limit). Would an email attachment to your email account be acceptable?

Here is some basic information from one of the clusters we are seeing the issue with:

[root@OSPBMGR1 ~]# oc version
Client Version: 4.4.0-0.nightly-s390x-2020-05-25-145353
Server Version: 4.4.0-0.nightly-s390x-2020-05-25-145353
Kubernetes Version: v1.17.1

[root@OSPBMGR1 ~]# oc get clusterversion
NAME      VERSION                                   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.4.0-0.nightly-s390x-2020-05-25-145353   True        False         20h     Cluster version is 4.4.0-0.nightly-s390x-2020-05-25-145353

[root@OSPBMGR1 ~]# oc get clusterversion -o jsonpath='{.items[].spec.clusterID}{"\n"}'
ac97dfd6-d9d8-48ba-b1a8-80ce7b76226a

Thank you.
Could you try removing the audit_logs from the must-gather, re-create the .tar.gz, and check the size?
Yes, removing the audit_logs directory reduces the size of the must-gather tar file to approximately 139MB. Here are the approximate sizes of the remaining must-gather component directories:

[root@OSPBMGR1 quay-io-openshift-release-dev-ocp-v4-0-art-dev-sha256-b1d72669d15ae0d9727a9201c225f458b9d6ce09a6d67a0edc43469774d74cbb]# du -ms *
4       cluster-scoped-resources
1015    host_service_logs
242     namespaces
1       version

Thank you.
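For reference, a minimal sketch of that pruning step, assuming the must-gather output lives in a directory named must-gather.local (a placeholder; the actual directory name will differ):

# Remove the audit log directories from the must-gather dump.
find must-gather.local -type d -name audit_logs -prune -exec rm -rf {} +

# Re-create the compressed archive and check its size.
tar czf must-gather-no-audit.tar.gz must-gather.local
du -h must-gather-no-audit.tar.gz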
OK. Can you just attach the kubelet logs from the workers that had trouble joining?
Created attachment 1695510 [details] worker-0 kubelet log part 1 of 2
Created attachment 1695511 [details] worker-0 kubelet log part 2 of 2
Created attachment 1695512 [details] worker-1 kubelet log part 1 of 3
Created attachment 1695513 [details] worker-1 kubelet log part 2 of 3
Created attachment 1695514 [details] worker-1 kubelet log part 3 of 3
Prashanth, Thanks. Given their sizes, I needed to divide the worker-0 and worker-1 kubelet log files into 2 and 3 tar.gz parts, respectively. Please let us know if you need any additional information. Thank you, Kyle
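For reference, a minimal sketch of how a large kubelet log can be split into attachment-sized pieces; the file name worker-0-kubelet.log is a placeholder for illustration:

# Compress the kubelet log, then split the archive into <=19MB chunks
# so each piece fits under the attachment limit.
tar czf worker-0-kubelet.log.tar.gz worker-0-kubelet.log
split -b 19m -d worker-0-kubelet.log.tar.gz worker-0-kubelet.log.tar.gz.part-

# To reassemble on the receiving side:
cat worker-0-kubelet.log.tar.gz.part-* > worker-0-kubelet.log.tar.gz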
OK. On a first glance, it looks like the kubelet was able to register the node early on:

Jun 03 16:21:35 worker-1.pok-43-may27.pok.stglabs.ibm.com hyperkube[1395]: I0603 16:21:35.287828 1395 kubelet_node_status.go:73] Successfully registered node worker-1.pok-43-may27.pok.stglabs.ibm.com
Jun 03 16:22:15 worker-1.pok-43-may27.pok.stglabs.ibm.com hyperkube[1395]: I0603 16:22:15.368873 1395 kubelet_node_status.go:486] Recording NodeReady event message for node worker-1.pok-43-may27.pok.stglabs.ibm.com

But it looks like an hour and a half later the node was rebooted:

Jun 03 17:48:15 worker-1.pok-43-may27.pok.stglabs.ibm.com systemd[1]: Stopped Kubernetes Kubelet.
Jun 03 17:48:15 worker-1.pok-43-may27.pok.stglabs.ibm.com systemd[1]: kubelet.service: Consumed 4min 31.230s CPU time
-- Reboot --
Jun 03 17:48:36 worker-1.pok-43-may27.pok.stglabs.ibm.com systemd[1]: Starting Kubernetes Kubelet...

And then there were some errors, probably because the apiserver was down:

Jun 03 17:48:57 worker-1.pok-43-may27.pok.stglabs.ibm.com hyperkube[1391]: E0603 17:48:57.952655 1391 kubelet_node_status.go:92] Unable to register node "worker-1.pok-43-may27.pok.stglabs.ibm.com" with API server: Post https://api-int.pok-43-may27.pok.stglabs.ibm.com:6443/api/v1/nodes: EOF

And then 3 minutes later it succeeds:

Jun 03 17:51:02 worker-1.pok-43-may27.pok.stglabs.ibm.com hyperkube[1391]: I0603 17:51:02.841651 1391 kubelet_node_status.go:73] Successfully registered node worker-1.pok-43-may27.pok.stglabs.ibm.com

Was the cluster rebooted after a while for a particular reason? The kubelet logs above say the node registration succeeded at 16:21:35, which is pretty early on. Did you see the node when you did an `oc get nodes`?
Prashanth,

Thanks for your analysis, and my apologies for any misunderstanding. Here is some additional background on the OCP 4.4 zVM install issues we are experiencing and how it relates to the data submitted to this bug.

There are at least 3 different OCP 4.4 zVM install process scenarios with issues that are not present with the OCP 4.3 zVM install process. In these scenarios we used a minimal configuration of 3 master nodes and 2 worker nodes; similar issues/conditions occur when using additional worker nodes. When installing OCP 4.4, after the 3 masters are booted/installed, depending on when the 2 worker nodes are booted after the 3 master nodes, these 3 different scenarios can occur. The issue seems to be the readiness of the master nodes to then integrate the worker nodes (possibly etcd). With OCP 4.4, the master nodes seem to require approximately 8 minutes from boot/install until they achieve the "Ready" state (as reported by the "oc get nodes" command).

1. Boot/install the 3 master nodes and the 2 worker nodes in quick succession, as is the normal install procedure with OCP 4.3.
===============================================================================================================================
Unless the worker nodes' pending CSRs are approved within a few minutes of the master nodes achieving their "Ready" state (as reported by the "oc get nodes" command), the console and authentication operators never achieve an "AVAILABLE" state of True (as reported by the "oc get co" command). Each worker node has 1 or more pending CSRs: 1 each prior to joining the cluster, and possibly at least 1 pending CSR each after joining the cluster.

2. Boot/install the 3 master nodes first, wait approximately 500 seconds, and then boot/install the 2 worker nodes.
===================================================================================================================
This OCP 4.4 zVM install process seems to work consistently: the master nodes install successfully, both workers install and join the cluster, all cluster operators achieve an "AVAILABLE" state of True (as reported by the "oc get co" command), and there are no pending CSRs. (A sketch of waiting on master readiness instead of a fixed sleep is shown after this comment.)

3. Boot/install the 3 master nodes and 1 worker node, wait approximately 500 seconds, and then boot/install the 2nd worker node.
================================================================================================================================
The first worker node does not integrate into the cluster without first manually approving its pending CSR(s), while the second worker node integrates into the cluster without manual intervention, without pending CSRs, and without issues. As there is at least one "Ready" state worker node in the cluster without intervention (as reported by the "oc get nodes" command), all cluster operators achieve "AVAILABLE" states of True (as reported by the "oc get co" command) within normal time frames.

The worker kubelet data I previously submitted was for install scenario #2 above, with both worker nodes integrating into the cluster without intervention. However, these 2 worker nodes would not have integrated successfully if their boot/install had not waited approximately 500 seconds after the master nodes' boot/install and subsequent "Ready" states. We are curious about this possible root-cause "readiness" condition on the master nodes.

We manually (and intentionally) rebooted the cluster after approximately 90 minutes as part of debug and error recovery testing.
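A minimal sketch of scenario #2's wait step, replacing the fixed ~500-second sleep with an explicit wait for master readiness; this is an illustration rather than part of the documented install flow, and assumes KUBECONFIG on the bastion already points at the new cluster:

# Wait (up to 15 minutes) for all master nodes to report the Ready condition
# before booting the worker nodes.
oc wait node --selector=node-role.kubernetes.io/master \
  --for=condition=Ready --timeout=900s

# Once this returns successfully, proceed with the worker node xautolog boots.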
For OCP 4.4 zVM install scenarios #1 and #3 listed above, we can recreate and provide any master and/or worker kubelet data that you would like. Thank you, Kyle
Thank you Kyle for this detailed explanation! It certainly makes things clearer. So if I understand correctly there are two issues:

- Master nodes take 8 minutes to reach the Ready state from the point when they show up in `oc get nodes`?
- Worker nodes are not automatically joined unless their CSRs are approved?

The second one might be the default behavior, because we also see that the CSRs do not get approved automatically; there is a manual step required to approve them. After that, things seem fine. This might be how it is designed to work, as the docs say you need to approve the CSRs if they are not approved automatically: https://docs.openshift.com/container-platform/4.3/installing/installing_bare_metal/installing-bare-metal.html#installation-approve-csrs_installing-bare-metal. But I will follow up on this to confirm.

As for the issue of the master nodes not reaching the Ready state within 8 minutes, that is odd. In our zVM setups we see all masters reach the Ready state in less than a minute or two from the time they are up. Could you provide us with the bootkube logs from the bootstrap node? Is there any slowness in the network that could lead to this? I am also assuming the masters have 16G or more of memory.
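For reference, a minimal sketch of the manual CSR approval step described in the linked docs, run from the bastion against the cluster; note that the one-liner approves every CSR that is still pending, so review the list first:

# List CSRs and their state; newly booted nodes show up here as Pending.
oc get csr

# Approve a single pending CSR by name (csr-xxxxx is a placeholder).
oc adm certificate approve csr-xxxxx

# Or approve everything that has no status yet (i.e., is still pending).
oc get csr -o go-template='{{range .items}}{{if not .status}}{{.metadata.name}}{{"\n"}}{{end}}{{end}}' \
  | xargs --no-run-if-empty oc adm certificate approve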
Prashanth,

Thank you for your assistance. Here is some information to help with your questions.

1. The master nodes require approximately 8 minutes from their zVM xautolog boot from their zVM readers until they attain STATUS "Ready" as reported by the "oc get nodes" command. Using the "oc get nodes" command, the 3 master nodes actually start to display with STATUS "NotReady" approximately 3 minutes after their zVM xautolog boot from their zVM readers. The 3 master nodes then require approximately 5 additional minutes before they transition from STATUS "NotReady" to "Ready".

2. The worker nodes can automatically join the cluster if their zVM xautolog boot from their zVM readers is timed to occur within a short time after the 3 master nodes attain STATUS "Ready" as reported by the "oc get nodes" command. If this window is missed, then the pending CSRs for these worker nodes must be approved within 1-2 minutes. If the pending worker node CSRs are approved after this 1-2 minute window, the authentication and console operators may not achieve an AVAILABLE status of True, as indicated by the "oc get co" command.

3. All nodes in the few OCP 4.4 clusters we have seen this issue with have 32GB of real memory, including the bastion, bootstrap, master, and worker nodes.

4. This OCP 4.4 install issue does not seem to be a network issue because:
   1. Both the IBM Germany and USA Solution Test teams see the same/similar behavior in their separate lab environments.
   2. Installing OCP 4.3 with the 4.3.23 build on the same clusters does not have these issues: no pending CSRs and no 8-minute master readiness issues.

5. Yes, we will work to provide the bootkube logs from the bootstrap node.

Thank you,
Kyle
Thanks again for the succinct explanation. While we are waiting for the bootstrap logs, could you also check if your masters are scheduled as workers too? i.e.,

[root@rock-zvm-3-1 ~]# ./oc get nodes
NAME                        STATUS   ROLES           AGE     VERSION
master-0.test.example.com   Ready    master,worker   2m57s   v1.18.3+1635e9d
master-1.test.example.com   Ready    master,worker   3m15s   v1.18.3+1635e9d
master-2.test.example.com   Ready    master,worker   2m52s   v1.18.3+1635e9d

If this happens, as soon as the masters are visible through `oc get nodes` (even if NotReady), could you set mastersSchedulable to false like this:

oc patch schedulers.config.openshift.io cluster --type merge --patch '{"spec":{"mastersSchedulable": false}}'

We believe this could be causing the authentication operator to have problems when the worker nodes come up. Could you also send the kubelet log for one of the masters along with the bootkube logs?

Thanks
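A minimal sketch of verifying the setting after applying the patch; the jsonpath check is an illustration, not a step from the thread:

# Confirm that masters are no longer schedulable as workers.
oc get schedulers.config.openshift.io cluster \
  -o jsonpath='{.spec.mastersSchedulable}{"\n"}'
# Expected output once the patch has been applied: false

# The "worker" role should eventually disappear from the master nodes.
oc get nodes -l node-role.kubernetes.io/master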
Prashanth,

Thanks for the information. I have reinstalled a few times to test some scenarios and the mastersSchedulable-to-false setting, and will work to upload the kubelet log for one of the masters along with the bootkube logs today.

1. For the 4.4.0-0.nightly-s390x-2020-05-25-145353 build I have been testing with, all 3 master nodes are also scheduled as workers.

[root@ospbmgr2 /]# oc get nodes
NAME                                        STATUS   ROLES           AGE    VERSION
master-0.pok-90-may25.pok.stglabs.ibm.com   Ready    master,worker   132m   v1.17.1
master-1.pok-90-may25.pok.stglabs.ibm.com   Ready    master,worker   131m   v1.17.1
master-2.pok-90-may25.pok.stglabs.ibm.com   Ready    master,worker   131m   v1.17.1
worker-0.pok-90-may25.pok.stglabs.ibm.com   Ready    worker          11m    v1.17.1
worker-1.pok-90-may25.pok.stglabs.ibm.com   Ready    worker          126m   v1.17.1

2. One key difference between your configuration and ours seems to be the Kubernetes version. Your Kubernetes version is v1.18.3+1635e9d, while our Kubernetes version is v1.17.1.

[root@ospbmgr2 /]# oc version
Client Version: 4.4.0-0.nightly-s390x-2020-05-25-145353
Server Version: 4.4.0-0.nightly-s390x-2020-05-25-145353
Kubernetes Version: v1.17.1

The release.txt files for the 05-18-2020, 05-25-2020, 06-01-2020, and 06-08-2020 builds all indicate: Kubernetes 1.17.1

3. Setting the mastersSchedulable value to false seems to resolve the authentication operator issue in my initial tests, and I will be conducting a few more tests today.

Thank you,
Kyle
Thank you Kyle! Please don't worry about the Kubernetes version, as I was testing with some 4.6 builds at that time. The Kubernetes version of 4.4 is v1.17.1, as you pointed out. Setting mastersSchedulable to false is a recommended step; with the masters left schedulable, race conditions in where the ingress pods are placed might have caused the issue you saw. Please make sure that setting is part of your installation steps.
Created attachment 1696923 [details] master-0 kubelet log part 1 of 5
Created attachment 1696924 [details] master-0 kubelet log part 2 of 5
Created attachment 1696925 [details] master-0 kubelet log part 3 of 5
Created attachment 1696926 [details] master-0 kubelet log part 4 of 5
Created attachment 1696927 [details] master-0 kubelet log part 5 of 5
Created attachment 1696928 [details] master-0 kubelet log part 2 of 5
Created attachment 1696929 [details] bootstrap-0 bootkube service log
Created attachment 1696930 [details] bootstrap-0 bootstrap-control-plane logs
Prashanth,

Thanks for the information. I have uploaded the requested bootstrap-0 and master-0 logs.

1. My apologies for not getting these logs to you yesterday -- two colleagues and I were trying to install the OCP 4.4 nightly 06-08-2020 build on multiple clusters and were unsuccessful; only the bootstrap-0 node installs successfully. Would you know if anyone on your team has successfully installed the OCP 4.4 nightly 06-08-2020 build?

2. Given the master-0 node kubelet log size, I needed to split it into 5 tar.gz parts.

Please let us know if you need any additional information.

Thank you,
Kyle
Prashanth, Just a quick update that as we continue to test with the June 12, 2020 (06-08-2020) OCP 4.4 nightly build, starting on Friday and over the weekend, our initial results seem to indicate that most of the OCP 4.4 install problems we have encountered to date have now been corrected with this build. As an example, instead of requiring several minutes, the 3 masters now achieve "Ready" STATUS (as reported by the "oc get nodes" command) within 60-65 seconds. We'll provide additional information/updates on Monday. Thank you, Kyle
Prashanth, We're continuing to test with the June 12, 2020 (06-08-2020) OCP 4.4 nightly build, and the build consistently installs without needing to sleep between the master node and worker node boots, as long as the pending worker nodes' CSRs are accepted, and the mastersSchedulable option is set to false. There does seem to be a longer delay between when the master nodes become Ready and the worker nodes install and then become Ready. We are continuing to investigate. Thank you, Kyle
That's great news Kyle. Please keep us updated and let me know if this issue can be closed once you confirm.
Prashanth,

We have been continuing to test with the June 12, 2020 OCP 4.4 build (06-12-2020; please forgive my typo in my previous post where I indicated 06-08-2020) and our zVM installation tests have been consistently successful, given that we approve the pending worker node CSRs and set mastersSchedulable to false.

Our subsequent tests of the 06-14-2020, 06-15-2020, and 06-17-2020 builds indicate basic install issues (the master nodes do not install consistently), and we are continuing the debug effort. We encountered similar master install issues with the 06-01-2020 and 06-08-2020 builds.

At this point, the most stable OCP 4.4 build to date seems to be the June 12, 2020 (06-12-2020) build, which also does not require the approximately 7-8 minute sleep between the boot/install of the master nodes and the boot/install of the worker nodes. We encountered the issue where the worker nodes' boot/install must wait 7-8 minutes after the master nodes' boot/install with the 05-18-2020 and 05-25-2020 OCP 4.4 builds.

Thank you,
Kyle
Kyle, I just tested the 06-18-2020 build and saw no issues. I approved the CSRs, the workers were added, and the cluster came up fine with all operators running. As before, please provide bootkube and kubelet logs from the bootstrap and the problematic nodes. I believe the issue with the worker nodes having to wait 7-8 minutes is resolved with the newer builds, so the issue you are seeing now is something new? When you say the master nodes do not install -- do they never reach the Ready state? Do they not boot up? Some details would be good. Prashanth
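For reference, a minimal sketch of collecting those logs, assuming SSH access as the core user (the hostnames are placeholders):

# On the bootstrap node: capture the bootkube service log.
ssh core@bootstrap-0 'journalctl -b -u bootkube.service' > bootstrap-bootkube.log

# On a problematic master or worker: capture the kubelet log.
ssh core@master-0 'journalctl -b -u kubelet.service' > master-0-kubelet.log
ssh core@worker-0 'journalctl -b -u kubelet.service' > worker-0-kubelet.log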
Prashanth,

Thanks for the update. We just tested with the 06-18-2020 build, which consists of:

1. rhcos-44.81.202006171550-0-dasd.s390x.raw.gz
2. rhcos-44.81.202006171550-0-installer-initramfs.s390x.img
3. rhcos-44.81.202006171550-0-installer-kernel-s390x
4. rhcos-44.81.202006171550-0-installer.s390x.iso
5. rhcos-44.81.202006171550-0-metal.s390x.raw.gz
6. openshift-client-linux-4.4.0-0.nightly-s390x-2020-06-18-112815.tar.gz
7. openshift-install-linux-4.4.0-0.nightly-s390x-2020-06-18-112815.tar.gz

and it installs without issue. When trying to install with the latest 06-17-2020 openshift-client and openshift-install, the install fails with continual master and worker node HTTP GET requests that are never fulfilled.

Thank you,
Kyle
Prashanth,

When running openshift-install 4.4.0-0.nightly-s390x-2020-06-18-112815 and Client Version 4.4.0-0.nightly-s390x-2020-06-18-112815, we are continually getting the following journalctl -xe entries on our bootstrap node:

Jun 18 17:19:59 bootstrap-0.pok-90-jun15.pok.stglabs.ibm.com release-image-download.sh[1688]: Error: error pulling image "registry.svc.ci.openshift.org/ocp-s390x/release-s390x@sha256:1>
Jun 18 17:19:59 bootstrap-0.pok-90-jun15.pok.stglabs.ibm.com release-image-download.sh[1688]: Pull failed. Retrying registry.svc.ci.openshift.org/ocp-s390x/release-s390x@sha256:1620bc3>
Jun 18 17:20:00 bootstrap-0.pok-90-jun15.pok.stglabs.ibm.com release-image-download.sh[1688]: Error: error pulling image "registry.svc.ci.openshift.org/ocp-s390x/release-s390x@sha256:1>
Jun 18 17:20:00 bootstrap-0.pok-90-jun15.pok.stglabs.ibm.com release-image-download.sh[1688]: Pull failed. Retrying registry.svc.ci.openshift.org/ocp-s390x/release-s390x@sha256:1620bc3>
Jun 18 17:20:00 bootstrap-0.pok-90-jun15.pok.stglabs.ibm.com release-image-download.sh[1688]: Error: error pulling image "registry.svc.ci.openshift.org/ocp-s390x/release-s390x@sha256:1>
Jun 18 17:20:00 bootstrap-0.pok-90-jun15.pok.stglabs.ibm.com release-image-download.sh[1688]: Pull failed. Retrying registry.svc.ci.openshift.org/ocp-s390x/release-s390x@sha256:1620bc3>
lines 969-1013/1013 (END)

Thank you,
Kyle
I noticed that the image pull from the 6-12 installer is trying to get the release image quay.io/openshift-release-dev/ocp-release-nightly@sha256:139c0faf3422db2d106ff2d1f5fd44cc06f69a6a772ca7083adcf460a3e88c45

The 6-17 installer is trying the release image registry.svc.ci.openshift.org/ocp-s390x/release-s390x@sha256:ac208c137611e808414b0a9b6321f5983ce8933c6e17d50521b24cabc7ef2c78

I am not sure if this is the issue, but it is the only real difference I see at this point. I did the check after seeing this from journalctl -xe:

Jun 18 17:21:09 bootstrap-0.ospamgr2-jun17.zvmocp.notld release-image-download.sh[1718]: Pull failed. Retrying registry.svc.ci.openshift.org/ocp.....

Thanks,
Christian
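A minimal sketch of how the release image pinned by a given installer binary can be checked; the binary path is a placeholder, and oc adm release info needs a valid pull secret for the registry in question:

# Print the release image pull spec embedded in this installer binary.
./openshift-install version

# Inspect that release image to confirm which registry it comes from
# and which component images it references.
oc adm release info \
  quay.io/openshift-release-dev/ocp-release-nightly@sha256:139c0faf3422db2d106ff2d1f5fd44cc06f69a6a772ca7083adcf460a3e88c45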
Prashanth,

We get the same issue when using Client Version 4.4.0-0.nightly-s390x-2020-06-17-185805, openshift-install 4.4.0-0.nightly-s390x-2020-06-17-185805, and bootstrap RHCOS level Red Hat Enterprise Linux CoreOS 44.81.202006171550-0 (Ootpa) 4.4.

We also see this intermittently on the bootstrap:

[systemd] Failed Units: 1 sssd.service

We are currently seeing the following on the bootstrap:

Jun 18 17:48:26 bootstrap-0.pok-90-jun15.pok.stglabs.ibm.com release-image-download.sh[1717]: Error: error pulling image "registry.svc.ci.openshift.org/ocp-s390x/release-s390x@sha256:a>
Jun 18 17:48:26 bootstrap-0.pok-90-jun15.pok.stglabs.ibm.com release-image-download.sh[1717]: Pull failed. Retrying registry.svc.ci.openshift.org/ocp-s390x/release-s390x@sha256:ac208c1>
Jun 18 17:48:26 bootstrap-0.pok-90-jun15.pok.stglabs.ibm.com release-image-download.sh[1717]: Error: error pulling image "registry.svc.ci.openshift.org/ocp-s390x/release-s390x@sha256:a>
Jun 18 17:48:26 bootstrap-0.pok-90-jun15.pok.stglabs.ibm.com release-image-download.sh[1717]: Pull failed. Retrying registry.svc.ci.openshift.org/ocp-s390x/release-s390x@sha256:ac208c1>
Jun 18 17:48:27 bootstrap-0.pok-90-jun15.pok.stglabs.ibm.com release-image-download.sh[1717]: Error: error pulling image "registry.svc.ci.openshift.org/ocp-s390x/release-s390x@sha256:a>
Jun 18 17:48:27 bootstrap-0.pok-90-jun15.pok.stglabs.ibm.com release-image-download.sh[1717]: Pull failed. Retrying registry.svc.ci.openshift.org/ocp-s390x/release-s390x@sha256:ac208c1>

[root@bootstrap-0 core]# oc version
Client Version: 4.4.0-202006132207-d038424

Thank you,
Kyle
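A minimal sketch of manually testing the failing pull from the bootstrap node, assuming the pull secret was written by Ignition to /root/.docker/config.json (the exact path is an assumption and may differ on your bootstrap image):

# Check basic reachability of the CI registry from the bootstrap node.
curl -sI https://registry.svc.ci.openshift.org/v2/

# Try the pull by hand with the cluster pull secret to see the full error text.
podman pull --authfile /root/.docker/config.json \
  registry.svc.ci.openshift.org/ocp-s390x/release-s390x@sha256:ac208c137611e808414b0a9b6321f5983ce8933c6e17d50521b24cabc7ef2c78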
Prashanth, Thank you again to you and your colleagues for all the assistance with the dual build stream issue. We have been able to successfully install multiple zVM clusters with the OCP 4.4.0-0.nightly-s390x-2020-06-17-185805 build without issue. We are continuing to test and will let you know if there are any issues. Thank you, Kyle
Hi Kyle, can this bug be closed or is there additional work to be done? Prashanth is OOTO at the moment so any work to be done will likely be deferred until the next sprint.
Thank you for all your assistance. Please close this issue.
Closing per confirmation from reporter