|Summary:||Incorrect installation of ibmcloud vpc csi driver in IBM Cloud ROKS 4.10|
|Product:||OpenShift Container Platform||Reporter:||Jeff Nowicki <jnowicki>|
|Component:||Storage||Assignee:||Jonathan Dobson <jdobson>|
|Storage sub component:||Storage||QA Contact:||Chao Yang <chaoyang>|
|Status:||CLOSED ERRATA||Docs Contact:|
|Priority:||high||CC:||aos-bugs, arahamad, chaoyang, cschaefe, jdobson, jsafrane, rtheis|
|Fixed In Version:||Doc Type:||No Doc Update|
|Doc Text:||Story Points:||---|
|:||2060557 (view as bug list)||Environment:|
|Last Closed:||2022-08-10 10:52:11 UTC||Type:||Bug|
|oVirt Team:||---||RHEL 7.3 requirements from Atomic Host:|
|Cloudforms Team:||---||Target Upstream Version:|
|Bug Depends On:|
|Bug Blocks:||2060557, 2061483|
Description Jeff Nowicki 2022-03-03 16:31:30 UTC
Description of problem:
An OpenShift 4.10 IPI install should ensure that the "ibmcloud vpc csi driver" is installed on IBM Cloud only when "controlPlaneTopology" (see the infrastructure resource) is set to internal (that is, NOT external). This was discovered during IBM ROKS 4.10 bringup (PR tests were breaking due to installation errors related to this issue).

The following components were installed (incorrectly) on a "classic infrastructure" IBM ROKS 4.10 cluster:

openshift-cluster-csi-drivers   ibm-vpc-block-csi-controller-7f6958b-l66mb           0/5   ContainerCreating   0   46h
openshift-cluster-csi-drivers   ibm-vpc-block-csi-driver-operator-56bf948469-8fscf   1/1   Running             0   46h
openshift-cluster-csi-drivers   ibm-vpc-block-csi-node-d6rts                         0/3   Init:0/1            0   46h
openshift-cluster-csi-drivers   ibm-vpc-block-csi-node-lf48n                         0/3   Init:0/1            0   46h
openshift-cluster-csi-drivers   ibm-vpc-block-csi-node-q72kc                         0/3   Init:0/1            0   46h

Version-Release number of selected component (if applicable):
4.10

How reproducible:
IBM Cloud ROKS 4.10 PR testing - please work with IBM (jnowicki) to recreate/validate.

Steps to Reproduce:
1. Run IBM Cloud ROKS 4.10 PR tests

Actual results:
PR tests are failing.

Expected results:
PR tests succeed.

Master Log:

Node Log (of failed PODs):

PV Dump:

PVC Dump:

StorageClass Dump (if StorageClass used by PV/PVC):

Additional info:
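The condition described above can be sketched in Go as a check against the JSON form of the infrastructure resource. This is only an illustration, not code from the fix: the helper name wantsVPCDriver and the sample payloads are hypothetical, and the sketch assumes the topology and platform are read from status.controlPlaneTopology and status.platformStatus.type, with "External" as the value reported by a managed (ROKS) control plane.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// infrastructure mirrors only the fields of the cluster infrastructure
// resource that this check needs; all other fields are ignored on decode.
type infrastructure struct {
	Status struct {
		ControlPlaneTopology string `json:"controlPlaneTopology"`
		PlatformStatus       struct {
			Type string `json:"type"`
		} `json:"platformStatus"`
	} `json:"status"`
}

// wantsVPCDriver is a hypothetical helper: it reports whether the VPC block
// CSI driver should be installed, i.e. the platform is IBMCloud and the
// control plane topology is anything other than External.
func wantsVPCDriver(raw []byte) (bool, error) {
	var infra infrastructure
	if err := json.Unmarshal(raw, &infra); err != nil {
		return false, err
	}
	onIBMCloud := infra.Status.PlatformStatus.Type == "IBMCloud"
	external := infra.Status.ControlPlaneTopology == "External"
	return onIBMCloud && !external, nil
}

func main() {
	// Hand-written sample payloads (assumptions, not captured cluster output).
	samples := []struct {
		name string
		raw  string
	}{
		{"managed ROKS", `{"status":{"controlPlaneTopology":"External","platformStatus":{"type":"IBMCloud"}}}`},
		{"IPI install", `{"status":{"controlPlaneTopology":"HighlyAvailable","platformStatus":{"type":"IBMCloud"}}}`},
	}
	for _, s := range samples {
		ok, err := wantsVPCDriver([]byte(s.raw))
		fmt.Printf("%s: install=%v err=%v\n", s.name, ok, err)
	}
}
```

On a live cluster the input would presumably come from something like `oc get infrastructure cluster -o json`.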
Comment 1 Jeff Nowicki 2022-03-03 16:40:51 UTC
Discussion thread in the CoreOS/ipi-upi-ibm-cloud Slack channel: https://coreos.slack.com/archives/C01U40AM37F/p1646318513793049

Suggestion from Jan (in the Slack thread): We could add some hook to CSIOperatorConfig and call it in shouldRunController with the current infrastructure. The hook for IBMCloud would allow installation of the driver only when the platform != external
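A minimal sketch of what Jan's suggestion might look like, written as a standalone Go program. Only the names CSIOperatorConfig and shouldRunController come from the thread; the struct fields, the ShouldInstall hook, the ibmCloudHook function, and the simplified Infrastructure type are assumptions made for illustration and do not reflect the actual operator code or the merged fix.

```go
package main

import "fmt"

// Infrastructure is a simplified, hypothetical stand-in for the cluster
// infrastructure resource: just the platform type and control plane topology.
type Infrastructure struct {
	Platform             string
	ControlPlaneTopology string
}

// CSIOperatorConfig is a hypothetical, trimmed-down version of the operator
// configuration, extended with an optional per-platform hook.
type CSIOperatorConfig struct {
	Name     string
	Platform string
	// ShouldInstall, when non-nil, can veto installation based on the
	// current infrastructure. A nil hook means no extra restriction.
	ShouldInstall func(infra Infrastructure) bool
}

// shouldRunController decides whether the driver operator should be started:
// the platform must match, and the hook (if any) must agree.
func shouldRunController(cfg CSIOperatorConfig, infra Infrastructure) bool {
	if cfg.Platform != infra.Platform {
		return false
	}
	if cfg.ShouldInstall != nil {
		return cfg.ShouldInstall(infra)
	}
	return true
}

// ibmCloudHook allows installation only when the control plane is not external.
func ibmCloudHook(infra Infrastructure) bool {
	return infra.ControlPlaneTopology != "External"
}

func main() {
	cfg := CSIOperatorConfig{
		Name:          "ibm-vpc-block-csi-driver-operator",
		Platform:      "IBMCloud",
		ShouldInstall: ibmCloudHook,
	}
	roks := Infrastructure{Platform: "IBMCloud", ControlPlaneTopology: "External"}
	ipi := Infrastructure{Platform: "IBMCloud", ControlPlaneTopology: "HighlyAvailable"}
	fmt.Println("ROKS:", shouldRunController(cfg, roks))
	fmt.Println("IPI: ", shouldRunController(cfg, ipi))
}
```

Keeping the hook optional means configurations for other platforms would be unaffected, which matches the later request to verify that the fix does not break a regular IPI install.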
Comment 2 Jonathan Dobson 2022-03-03 18:45:04 UTC
Discussed with Jeff: we won't call this a blocker for 4.10, but it is a priority fix for 4.10.1. They can work around it for now.
Comment 5 Jeff Nowicki 2022-03-08 16:43:01 UTC
@chaoyang Would you be able to prioritize verifying this BZ (marking it verified) so we can get the 4.10 cherry-pick PR merged? The RH verification test should be to verify that the fix did not break an IPI install.

@jdobson verified: see https://coreos.slack.com/archives/C01U40AM37F/p1646672454908649?thread_ts=1646318513.793049&cid=C01U40AM37F

(from jonathan) "I did at least do an IPI install with those changes on 4.11, made sure the operator/driver got deployed and could provision PV's. QE could certainly do something similar to verify it doesn't break unmanaged openshift."

IBM Cloud ROKS (managed openshift) can only test once this fix gets into a release build. Thank you.
Comment 6 Jonathan Dobson 2022-03-08 17:10:26 UTC
Adding needinfo for Chao on Jeff's question above.
Comment 7 Chao Yang 2022-03-09 12:04:39 UTC
oc get pods -n openshift-cluster-csi-drivers
NAME                                                READY   STATUS    RESTARTS       AGE
ibm-vpc-block-csi-controller-786656b5ff-f2cgt       5/5     Running   4 (110m ago)   120m
ibm-vpc-block-csi-driver-operator-cd9cc677c-hmjht   1/1     Running   0              120m
ibm-vpc-block-csi-node-8v9kr                        3/3     Running   0              114m
ibm-vpc-block-csi-node-cbhk7                        3/3     Running   0              120m
ibm-vpc-block-csi-node-d9gkf                        3/3     Running   0              113m
ibm-vpc-block-csi-node-mhcnm                        3/3     Running   0              113m
ibm-vpc-block-csi-node-xbdq8                        3/3     Running   0              120m
ibm-vpc-block-csi-node-z9gf7                        3/3     Running   0              120m

Regression test passed.

oc get clusterversion
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.11.0-0.nightly-2022-03-08-191358   True        False         98m     Cluster version is 4.11.0-0.nightly-2022-03-08-191358
Comment 8 Richard Theis 2022-03-18 13:01:33 UTC
Thank you. We have verified the fix on Red Hat OpenShift on IBM Cloud version 4.10.
Comment 10 errata-xmlrpc 2022-08-10 10:52:11 UTC
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2022:5069