The Cinder CSI driver operator uses a static cloud.conf file [1], which means it is currently not possible to set driver options [2]. This is essentially the same problem we have with the cloud provider (bug 2049775), but for Cinder CSI. We should use a strategy similar to the one detailed in the cloud provider's cloud.conf upgrade enhancement proposal [3] for the cloud.conf used by Cinder CSI.

[1] https://github.com/openshift/openstack-cinder-csi-driver-operator/blob/master/assets/configmap.yaml
[2] https://github.com/kubernetes/cloud-provider-openstack/blob/master/docs/cinder-csi-plugin/using-cinder-csi-plugin.md#driver-config
[3] https://github.com/openshift/enhancements/pull/1009
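For context, these are the kinds of [BlockStorage] options from the upstream driver docs [2] that a static cloud.conf prevents users from customizing. The fragment below is illustrative only (the values are examples, not defaults); the option names are the ones exercised later in this BZ:

```ini
; Illustrative cloud.conf fragment -- values are examples, not defaults.
; With a static cloud.conf, none of these can currently be set by the user.
[BlockStorage]
ignore-volume-az = yes
node-volume-attach-limit = 5
rescan-on-resize = true
```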
Removing the Triaged keyword because:
* the QE automation assessment (flag qe_test_coverage) is missing
This BZ solves the following Jira RFE: https://issues.redhat.com/browse/RFE-2587
Verified on 4.11.0-0.nightly-2022-06-06-025509 on top of RHOS-16.2-RHEL-8-20220311.n.1.

On a cluster that requires the parameter 'ignore-volume-az = yes', the user can now disable it, and it is confirmed that the change is applied.

*Note1: rescan-on-resize: configuring this parameter breaks the cluster. Kubelet does not accept it and the node gets stuck in NotReady. BZ: https://bugzilla.redhat.com/show_bug.cgi?id=2077933
*Note2: node-volume-attach-limit: although it appears in the cloud-conf, the change does not seem to be applied. Creating more PVCs than the limit on the same worker is accepted. BZ: https://bugzilla.redhat.com/show_bug.cgi?id=2094829

Verification steps:

Cluster running with multiAZ+rootVolumes+Manila after a successful IPI installation:

$ oc get clusterversion
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.11.0-0.nightly-2022-06-06-025509   True        False         5m17s   Cluster version is 4.11.0-0.nightly-2022-06-06-025509

$ oc get machines -A
NAMESPACE               NAME                          PHASE     TYPE        REGION      ZONE      AGE
openshift-machine-api   ostest-6w4hg-master-0         Running   m4.xlarge   regionOne   AZhci-0   62m
openshift-machine-api   ostest-6w4hg-master-1         Running   m4.xlarge   regionOne   AZhci-2   62m
openshift-machine-api   ostest-6w4hg-master-2         Running   m4.xlarge   regionOne   AZhci-1   62m
openshift-machine-api   ostest-6w4hg-worker-0-tzjpg   Running   m4.xlarge   regionOne   AZhci-0   54m
openshift-machine-api   ostest-6w4hg-worker-1-rjx42   Running   m4.xlarge   regionOne   AZhci-2   54m
openshift-machine-api   ostest-6w4hg-worker-2-hmrvk   Running   m4.xlarge   regionOne   AZhci-1   54m

$ openstack server list --long -c Name -c Networks -c 'Availability Zone'
+-----------------------------+--------------------------------------------------------------+-------------------+
| Name                        | Networks                                                     | Availability Zone |
+-----------------------------+--------------------------------------------------------------+-------------------+
| ostest-6w4hg-worker-0-tzjpg | StorageNFS=172.17.5.168; ostest-6w4hg-openshift=10.196.2.213 | AZhci-0           |
| ostest-6w4hg-worker-2-hmrvk | StorageNFS=172.17.5.184; ostest-6w4hg-openshift=10.196.2.71  | AZhci-1           |
| ostest-6w4hg-worker-1-rjx42 | StorageNFS=172.17.5.221; ostest-6w4hg-openshift=10.196.0.22  | AZhci-2           |
| ostest-6w4hg-master-2       | ostest-6w4hg-openshift=10.196.2.220                          | AZhci-1           |
| ostest-6w4hg-master-1       | ostest-6w4hg-openshift=10.196.2.80                           | AZhci-2           |
| ostest-6w4hg-master-0       | ostest-6w4hg-openshift=10.196.0.205                          | AZhci-0           |
+-----------------------------+--------------------------------------------------------------+-------------------+

$ for i in $(openstack volume list -c Name -f value); do echo "# $i"; openstack volume show $i -c availability_zone -c name -c description -f value; echo ; done
# ostest-6w4hg-worker-0-tzjpg-root
cinderAZ0
Root volume for ostest-6w4hg-worker-0-tzjpg
ostest-6w4hg-worker-0-tzjpg-root

# ostest-6w4hg-worker-2-hmrvk-root
cinderAZ0
Root volume for ostest-6w4hg-worker-2-hmrvk
ostest-6w4hg-worker-2-hmrvk-root

# ostest-6w4hg-worker-1-rjx42-root
cinderAZ1
Root volume for ostest-6w4hg-worker-1-rjx42
ostest-6w4hg-worker-1-rjx42-root

# ostest-6w4hg-master-1
cinderAZ1
Created By OpenShift Installer
ostest-6w4hg-master-1

# ostest-6w4hg-master-2
cinderAZ0
Created By OpenShift Installer
ostest-6w4hg-master-2

# ostest-6w4hg-master-0
cinderAZ0
Created By OpenShift Installer
ostest-6w4hg-master-0

Existing config before performing any change:

$ oc get cm cloud-provider-config -n openshift-config -o yaml
apiVersion: v1
data:
  ca-bundle.pem: |
    -----BEGIN CERTIFICATE-----
    MIIEjjCCAvagAwIBAgIBATANBgkqhkiG9w0BAQsFADA3MRUwEwYDVQQKDAxSRURI
    QVQuTE9DQUwxHjAcBgNVBAMMFUNlcnRpZmljYXRlIEF1dGhvcml0eTAeFw0yMjA2
    MDMyMDA2MTFaFw00MjA2MDMyMDA2MTFaMDcxFTATBgNVBAoMDFJFREhBVC5MT0NB
    TDEeMBwGA1UEAwwVQ2VydGlmaWNhdGUgQXV0aG9yaXR5MIIBojANBgkqhkiG9w0B
    AQEFAAOCAY8AMIIBigKCAYEA1Etjh3AD96m9m7+SSo34m4LED1e8kfGOHDxWZju+
    DlxqRW/ziS7pscGobgH9I5En7ALmBo68kx1Lq9XA9epDv63spuwJzYHS/L8v3+0l
    7RdBe/BeoWcbLha9QcWSaLYkR45hyZF1apHto5xutYTV4VBiUzNCQWoXhg0FaP/t
    qNkLM/CURuMI6LX50odl3IUFgiF3+/j4F5EJzApfU2bBMXXXn6Tt5PkXysrjRitz
    nfPd/j+Ygw8LJEiTz8fl6qysXjyeWgovurBGcfL1OZt29G7bwMu7XRpTxsD6JcNp
    3KT6RTkS9U/9YQeFM320meJ1Ieuh/FZk7Mt/yZaPVOE+pl01deINWHtk5eP5sgu0
    3ivI3VCqjAaP0SYAdEBNvo4A3cN9Kh/g4B/ihDScMpR5vNjwJBTHRIV3qMdNvJUW
    171NuDbT6mHe/LMQMaHWaK86zUtkyAg3INjk1vY6rJNAZw9sTY4OLW0I/kNE3bHK
    9WQWMlf/WZIJKF/gje3x1pjfAgMBAAGjgaQwgaEwHwYDVR0jBBgwFoAUAp8QxbZh
    Qa5wdoVf+PB2MjE7W6swDwYDVR0TAQH/BAUwAwEB/zAOBgNVHQ8BAf8EBAMCAcYw
    HQYDVR0OBBYEFAKfEMW2YUGucHaFX/jwdjIxO1urMD4GCCsGAQUFBwEBBDIwMDAu
    BggrBgEFBQcwAYYiaHR0cDovL2lwYS1jYS5yZWRoYXQubG9jYWwvY2Evb2NzcDAN
    BgkqhkiG9w0BAQsFAAOCAYEAn6DTN9hlQOCifBR1kSHywd3wJOnrUUCeKGs6I8xK
    LSMyiHKBdPe7zt4L1/yL6H8KQayzThgKR2rUCdUn7eFXgbXcK5GYAuJ82AZPxb4H
    mxB4CLNCkNAKNCKn6pjHZa39wnnOjdTPCjSiklk1lkZyiTNeiE37wuWA469wugNE
    5o/rcS0UM1BAT6dLcFHOJPWm1J1aXDBuhHYl7e3wWjHAR5QwijvMUnguAvu3Qber
    LHTBxqD/qN2fR1WmfVZ2NVu5t8eAzzJOBlJs/eTA6gBLUgLgA38mx1i67wSbAclC
    b/9gIUZKKr2ZSB+gmkDtkbtznql9NMO0NLdwGJqdlvfFrG+WM8ZV8MuUChoUgS4P
    kvZyAy8/e0gRLi5WH5ig+JvTdknZN+eE2UL9JKnNpefXskDljQkltXrGiKnwpIPk
    bxYCbFNx53aoWw1pjQu+2zodoRE5x2KpHOOvKrKk7eiz02qk4vgVfWMq0qwat9or
    9dGditlV0YGYoQxGUn8TJjfQ
    -----END CERTIFICATE-----
  config: |
    [Global]
    secret-name = openstack-credentials
    secret-namespace = kube-system
    region = regionOne
    ca-file = /etc/kubernetes/static-pod-resources/configmaps/cloud-config/ca-bundle.pem
    [LoadBalancer]
    use-octavia = True
kind: ConfigMap
metadata:
  creationTimestamp: "2022-06-08T09:22:04Z"
  name: cloud-provider-config
  namespace: openshift-config
  resourceVersion: "1864"
  uid: 3eff3b69-379a-4cfb-8bd8-2b4672bbe11c

$ oc get cm -n openshift-cluster-csi-drivers cloud-conf -o yaml
apiVersion: v1
data:
  cloud.conf: |+
    [Global]
    region = regionOne
    ca-file = /etc/kubernetes/static-pod-resources/configmaps/cloud-config/ca-bundle.pem
    use-clouds = true
    clouds-file = /etc/kubernetes/secret/clouds.yaml
    cloud = openstack
    [LoadBalancer]
    use-octavia = True
    [BlockStorage]
    ignore-volume-az = yes
kind: ConfigMap
metadata:
  creationTimestamp: "2022-06-08T09:26:41Z"
  name: cloud-conf
  namespace: openshift-cluster-csi-drivers
  resourceVersion: "9119"
  uid: e1ca6a43-8e63-49f3-a5d4-26c8761a2b55

A pod running in a nova AZ using a PVC on a cinder AZ with a different name is up and running (meaning 'ignore-volume-az = yes' is working fine):

$ oc get pod -o wide
NAME                      READY   STATUS    RESTARTS   AGE    IP            NODE                          NOMINATED NODE   READINESS GATES
demo-0-78677fb76d-qggxm   1/1     Running   0          2m5s   10.128.2.13   ostest-6w4hg-worker-0-tzjpg   <none>           <none>

$ oc get pvc -o wide
NAME    STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS       AGE     VOLUMEMODE
pvc-0   Bound    pvc-7d214c10-16c9-4d1f-91b7-29668cdae5cb   1Gi        RWO            topology-aware-0   2m40s   Filesystem

$ openstack volume show pvc-7d214c10-16c9-4d1f-91b7-29668cdae5cb -c 'availability_zone'
+-------------------+-----------+
| Field             | Value     |
+-------------------+-----------+
| availability_zone | cinderAZ0 |
+-------------------+-----------+

Performing the change below on cloud-provider-config (according to https://github.com/kubernetes/cloud-provider-openstack/blob/master/docs/cinder-csi-plugin/using-cinder-csi-plugin.md#driver-config), with the goal of disabling the ignore-volume-az flag:

[...]
  config: |
    [Global]
    secret-name = openstack-credentials
    secret-namespace = kube-system
    region = regionOne
    ca-file = /etc/kubernetes/static-pod-resources/configmaps/cloud-config/ca-bundle.pem
    [LoadBalancer]
    use-octavia = True
    [BlockStorage]
    node-volume-attach-limit = 5
    ignore-volume-az = no
    [Metadata]
    search-order = metadataService
kind: ConfigMap
metadata:
  creationTimestamp: "2022-06-07T13:53:31Z"
  name: cloud-provider-config
  namespace: openshift-config
  resourceVersion: "1609"
  uid: 8cd96602-c562-4d9e-917d-82360428e40f

Re-configuration starts:

$ date
Wed Jun 8 10:39:38 UTC 2022
$ oc get nodes
NAME                          STATUS                     ROLES    AGE   VERSION
ostest-6w4hg-master-0         Ready,SchedulingDisabled   master   76m   v1.24.0+bb9c2f1
ostest-6w4hg-master-1         Ready                      master   76m   v1.24.0+bb9c2f1
ostest-6w4hg-master-2         Ready                      master   76m   v1.24.0+bb9c2f1
ostest-6w4hg-worker-0-tzjpg   Ready,SchedulingDisabled   worker   62m   v1.24.0+bb9c2f1
ostest-6w4hg-worker-1-rjx42   Ready                      worker   63m   v1.24.0+bb9c2f1
ostest-6w4hg-worker-2-hmrvk   Ready                      worker   61m   v1.24.0+bb9c2f1

Re-configuration ends:

$ date
Wed Jun 8 11:07:04 UTC 2022
$ oc get nodes
NAME                          STATUS   ROLES    AGE    VERSION
ostest-6w4hg-master-0         Ready    master   104m   v1.24.0+bb9c2f1
ostest-6w4hg-master-1         Ready    master   104m   v1.24.0+bb9c2f1
ostest-6w4hg-master-2         Ready    master   104m   v1.24.0+bb9c2f1
ostest-6w4hg-worker-0-tzjpg   Ready    worker   89m    v1.24.0+bb9c2f1
ostest-6w4hg-worker-1-rjx42   Ready    worker   91m    v1.24.0+bb9c2f1
ostest-6w4hg-worker-2-hmrvk   Ready    worker   89m    v1.24.0+bb9c2f1

Change is applied:

$ oc get cm -n openshift-cluster-csi-drivers cloud-conf -o yaml
apiVersion: v1
data:
  cloud.conf: |+
    [Global]
    region = regionOne
    ca-file = /etc/kubernetes/static-pod-resources/configmaps/cloud-config/ca-bundle.pem
    use-clouds = true
    clouds-file = /etc/kubernetes/secret/clouds.yaml
    cloud = openstack
    [LoadBalancer]
    use-octavia = True
    [BlockStorage]
    node-volume-attach-limit = 5
    ignore-volume-az = no
kind: ConfigMap
metadata:
  creationTimestamp: "2022-06-08T09:26:41Z"
  name: cloud-conf
  namespace: openshift-cluster-csi-drivers
  resourceVersion: "56619"
  uid: e1ca6a43-8e63-49f3-a5d4-26c8761a2b55

Creating the same pod again (running in a nova AZ and using a PVC on a cinder AZ with a different name) gets stuck in Pending (meaning 'ignore-volume-az = no' is working fine):

$ oc get pods -o wide
NAME                      READY   STATUS    RESTARTS   AGE   IP       NODE     NOMINATED NODE   READINESS GATES
demo-0-78677fb76d-bwrxq   0/1     Pending   0          82s   <none>   <none>   <none>           <none>

$ oc get pvc -o wide
NAME    STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS       AGE   VOLUMEMODE
pvc-0   Bound    pvc-18394212-5ac8-47f6-a88d-6036f64dd5ad   1Gi        RWO            topology-aware-0   86s   Filesystem

(shiftstack) [stack@undercloud-0 ~]$ openstack volume show pvc-18394212-5ac8-47f6-a88d-6036f64dd5ad -c 'availability_zone'
+-------------------+-----------+
| Field             | Value     |
+-------------------+-----------+
| availability_zone | cinderAZ0 |
+-------------------+-----------+

$ oc describe pod/demo-0-78677fb76d-bwrxq | tail -5
Events:
  Type     Reason            Age    From               Message
  ----     ------            ----   ----               -------
  Warning  FailedScheduling  2m17s  default-scheduler  running PreBind plugin "VolumeBinding": binding volumes: pv "pvc-18394212-5ac8-47f6-a88d-6036f64dd5ad" node affinity doesn't match node "ostest-6w4hg-worker-0-tzjpg": no matching NodeSelectorTerms
  Warning  FailedScheduling  2m15s  default-scheduler  0/6 nodes are available: 3 node(s) had untolerated taint {node-role.kubernetes.io/master: }, 4 node(s) didn't match Pod's node affinity/selector, 6 node(s) had volume node affinity conflict. preemption: 0/6 nodes are available: 6 Preemption is not helpful for scheduling.
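At the text level, the verification above amounts to appending a [BlockStorage] stanza to the config key of the cloud-provider-config ConfigMap, which the operator then propagates into the cloud-conf ConfigMap in openshift-cluster-csi-drivers. The snippet below is a minimal local sketch of that text transformation only (the temp file and here-doc contents are illustrative; the real sync happens in-cluster via the operator):

```shell
# Local illustration only: mimic appending the [BlockStorage] stanza that
# the user adds to cloud-provider-config in this verification. The actual
# propagation into the cloud-conf ConfigMap is performed by the operator.
conf=$(mktemp)

# Starting config (subset of the pre-change config key shown above).
cat > "$conf" <<'EOF'
[Global]
region = regionOne

[LoadBalancer]
use-octavia = True
EOF

# Stanza added by the user (values taken from this BZ).
cat >> "$conf" <<'EOF'

[BlockStorage]
node-volume-attach-limit = 5
ignore-volume-az = no
EOF

# Show the stanza that the operator would sync into cloud-conf.
grep -A2 '\[BlockStorage\]' "$conf"
```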
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2022:5069