Description of problem:

When Cinder and Nova availability zones are different, we can't provision a pod with an attached volume:

Type     Reason            Age   From               Message
----     ------            ----  ----               -------
Warning  FailedScheduling  2m5s  default-scheduler  running PreBind plugin "VolumeBinding": binding volumes: pv "pvc-fda419fc-5dbc-4878-ab89-5cb1541a33a5" node affinity doesn't match node "ostest-dl27b-worker-0-cbdbh": no matching NodeSelectorTerms
Warning  FailedScheduling  2m5s  default-scheduler  0/6 nodes are available: 3 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn't tolerate, 3 node(s) had volume node affinity conflict.
Warning  FailedScheduling  2m2s  default-scheduler  0/6 nodes are available: 3 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn't tolerate, 3 node(s) had volume node affinity conflict.

This happens because the Cinder CSI driver adds a node affinity to the created PV, for example:

Node Affinity:
  Required Terms:
    Term 0:  topology.cinder.csi.openstack.org/zone in [AZ1]

To avoid this we need to set `ignore-volume-az = true` in the driver config.

How reproducible:
Always

Steps to Reproduce:
Create the following objects in the cluster:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: topology-aware-standard
provisioner: cinder.csi.openstack.org
parameters:
  availability: AZ1
volumeBindingMode: WaitForFirstConsumer
allowedTopologies:
- matchLabelExpressions:
  - key: topology.cinder.csi.openstack.org/zone
    values:
    - "AZ-0"
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc1
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: topology-aware-standard
---
apiVersion: v1
kind: Pod
metadata:
  name: app
spec:
  containers:
    - image: nginx
      imagePullPolicy: IfNotPresent
      name: nginx
      ports:
        - containerPort: 80
          protocol: TCP
      volumeMounts:
        - mountPath: /var/lib/www/data
          name: mydata
  volumes:
    - name: mydata
      persistentVolumeClaim:
        claimName: pvc1
        readOnly: false

AZ-0 is a Nova AZ.
AZ1 is a Cinder AZ.

Actual results:
Provisioning of the Pod fails because of scheduling issues.

Expected results:
The PV is provisioned and the Pod reaches the Running state.

Additional info:
Upstream Issue: https://github.com/kubernetes/cloud-provider-openstack/issues/1300
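For reference, the setting lands in the [BlockStorage] section of the driver's cloud configuration. A minimal sketch of the relevant snippet, matching the operator-rendered ConfigMap quoted verbatim in the verification comment below:

# Sketch only: the section that disables the Cinder/Nova AZ match check
# when attaching Cinder volumes (rendered by the operator as multiaz-cloud.conf).
[Global]
use-clouds  = true
clouds-file = /etc/kubernetes/secret/clouds.yaml
cloud       = openstack

[BlockStorage]
ignore-volume-az = yes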
https://bugzilla.redhat.com/show_bug.cgi?id=1936871

Verified on 4.8.0-0.nightly-2021-06-08-034312 over OSP 16.1 (RHOS-16.1-RHEL-8-20210323.n.0).

Test #1. One compute zone and three volume zones, all with different names. rootVolumes enabled for all the nodes.

install-config includes:

compute:
- name: worker
  platform:
    openstack:
      zones: ['AZ-0', 'AZ-0', 'AZ-0']
      additionalNetworkIDs: []
      rootVolume:
        size: 25
        type: tripleo
        zones: ['cinderAZ0', 'cinderAZ1', 'cinderAZ0']
  replicas: 3
controlPlane:
  name: master
  platform:
    openstack:
      zones: ['AZ-0', 'AZ-0', 'AZ-0']
      rootVolume:
        size: 25
        type: tripleo
        zones: ['cinderAZ0', 'cinderAZ1', 'cinderAZ0']
  replicas: 3

where the project has the following availability zones:

(shiftstack) [stack@undercloud-0 ~]$ openstack availability zone list --compute
+-----------+-------------+
| Zone Name | Zone Status |
+-----------+-------------+
| AZ-0      | available   |
+-----------+-------------+
(shiftstack) [stack@undercloud-0 ~]$ openstack availability zone list --volume
+-----------+-------------+
| Zone Name | Zone Status |
+-----------+-------------+
| nova      | available   |
| cinderAZ0 | available   |
| cinderAZ1 | available   |
+-----------+-------------+

Once the cluster is up, ignore-volume-az is set:

$ oc get cm -n openshift-cluster-csi-drivers openstack-cinder-config -o yaml
apiVersion: v1
data:
  cloud.conf: |
    [Global]
    use-clouds = true
    clouds-file = /etc/kubernetes/secret/clouds.yaml
    cloud = openstack
  multiaz-cloud.conf: |
    [Global]
    use-clouds = true
    clouds-file = /etc/kubernetes/secret/clouds.yaml
    cloud = openstack
    [BlockStorage]
    ignore-volume-az = yes
kind: ConfigMap
metadata:
  creationTimestamp: "2021-06-03T15:30:21Z"
  name: openstack-cinder-config
  namespace: openshift-cluster-csi-drivers
  resourceVersion: "13561"
  uid: 5d1ab6c0-d4b7-44a2-95ce-073869da611e

Manual test: after loading the manifests below, the pods reach the Running state while using PVCs created on different Cinder AZs.

$ cat test_pvc_azs.yaml
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: topology-aware-cinder-az0
provisioner: cinder.csi.openstack.org
parameters:
  availability: cinderAZ0
volumeBindingMode: WaitForFirstConsumer
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-cinder-az0
  namespace: demo
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: topology-aware-cinder-az0
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: demo-0
  namespace: demo
spec:
  replicas: 1
  selector:
    matchLabels:
      app: demo-0
      cinder-az: cinderAZ0
      nova-az: AZ-0
  template:
    metadata:
      labels:
        app: demo-0
        cinder-az: cinderAZ0
        nova-az: AZ-0
    spec:
      containers:
      - name: demo
        image: quay.io/kuryr/demo
        ports:
        - containerPort: 80
          protocol: TCP
        volumeMounts:
        - mountPath: /var/lib/www/data
          name: mydata
      nodeSelector:
        topology.cinder.csi.openstack.org/zone: AZ-0
      volumes:
      - name: mydata
        persistentVolumeClaim:
          claimName: pvc-cinder-az0
          readOnly: false

$ cat test_pvc_azs2.yaml
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: topology-aware-cinder-az1
provisioner: cinder.csi.openstack.org
parameters:
  availability: cinderAZ1
volumeBindingMode: WaitForFirstConsumer
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-cinder-az1
  namespace: demo
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: topology-aware-cinder-az1
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: demo-1
  namespace: demo
spec:
  replicas: 1
  selector:
    matchLabels:
      app: demo-1
  template:
    metadata:
      labels:
        app: demo-1
    spec:
      containers:
      - name: demo
        image: quay.io/kuryr/demo
        ports:
        - containerPort: 80
          protocol: TCP
        volumeMounts:
        - mountPath: /var/lib/www/data
          name: mydata
      nodeSelector:
        topology.cinder.csi.openstack.org/zone: AZ-0
      volumes:
      - name: mydata
        persistentVolumeClaim:
          claimName: pvc-cinder-az1
          readOnly: false
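The exact commands used to load these manifests are not recorded here; a plausible sequence, assuming the demo namespace does not exist yet, would be:

$ oc create namespace demo
$ oc apply -f test_pvc_azs.yaml
$ oc apply -f test_pvc_azs2.yaml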
$ oc get pods -n demo -o wide
NAME                      READY   STATUS    RESTARTS   AGE     IP            NODE                          NOMINATED NODE   READINESS GATES
demo-0-857fb67fb7-v6tqp   1/1     Running   0          4d23h   10.129.2.20   ostest-wjzt5-worker-2-cccmm   <none>           <none>
demo-1-7859fdc774-d96p2   1/1     Running   0          4d23h   10.129.2.19   ostest-wjzt5-worker-2-cccmm   <none>           <none>

$ oc get machines -A
NAMESPACE               NAME                          PHASE     TYPE        REGION      ZONE   AGE
openshift-machine-api   ostest-wjzt5-master-0         Running   m4.xlarge   regionOne   AZ-0   5d1h
openshift-machine-api   ostest-wjzt5-master-1         Running   m4.xlarge   regionOne   AZ-0   5d1h
openshift-machine-api   ostest-wjzt5-master-2         Running   m4.xlarge   regionOne   AZ-0   5d1h
openshift-machine-api   ostest-wjzt5-worker-0-4gjsv   Running   m4.xlarge   regionOne   AZ-0   5d
openshift-machine-api   ostest-wjzt5-worker-1-7qqcw   Running   m4.xlarge   regionOne   AZ-0   5d
openshift-machine-api   ostest-wjzt5-worker-2-cccmm   Running   m4.xlarge   regionOne   AZ-0   5d

$ oc get pvc -n demo
NAME             STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS                AGE
pvc-cinder-az0   Bound    pvc-f5b28358-c883-415f-95d5-51ef458e85a8   1Gi        RWO            topology-aware-cinder-az0   4d23h
pvc-cinder-az1   Bound    pvc-174a5a6e-e66e-4b30-9f7d-e085e2039f29   1Gi        RWO            topology-aware-cinder-az1   4d23h

$ openstack volume show pvc-f5b28358-c883-415f-95d5-51ef458e85a8 -c availability_zone
+-------------------+-----------+
| Field             | Value     |
+-------------------+-----------+
| availability_zone | cinderAZ0 |
+-------------------+-----------+

$ openstack volume show pvc-174a5a6e-e66e-4b30-9f7d-e085e2039f29 -c availability_zone
+-------------------+-----------+
| Field             | Value     |
+-------------------+-----------+
| availability_zone | cinderAZ1 |
+-------------------+-----------+

The whole CSI test suite [1] passed with the availability parameter set to 'cinderAZ0' on the StorageClasses, except for the two TCs mentioned in https://bugzilla.redhat.com/show_bug.cgi?id=1917710.

Test #2. Three compute zones and a single volume zone, with different names.

install-config.yaml includes:

compute:
- name: worker
  platform:
    openstack:
      zones: ['AZ-0', 'AZ-1', 'AZ-2']
      additionalNetworkIDs: []
  replicas: 3
controlPlane:
  name: master
  platform:
    openstack:
      zones: ['AZ-0', 'AZ-1', 'AZ-2']
  replicas: 3

where the project has the following availability zones:

(shiftstack) [stack@undercloud-0 ~]$ openstack availability zone list --compute
+-----------+-------------+
| Zone Name | Zone Status |
+-----------+-------------+
| AZ-0      | available   |
| AZ-1      | available   |
| AZ-2      | available   |
+-----------+-------------+
(shiftstack) [stack@undercloud-0 ~]$ openstack availability zone list --volume
+-----------+-------------+
| Zone Name | Zone Status |
+-----------+-------------+
| nova      | available   |
+-----------+-------------+

Once the cluster is up, ignore-volume-az is set:

$ oc get cm -n openshift-cluster-csi-drivers openstack-cinder-config -o yaml
apiVersion: v1
data:
  cloud.conf: |
    [Global]
    use-clouds = true
    clouds-file = /etc/kubernetes/secret/clouds.yaml
    cloud = openstack
  multiaz-cloud.conf: |
    [Global]
    use-clouds = true
    clouds-file = /etc/kubernetes/secret/clouds.yaml
    cloud = openstack
    [BlockStorage]
    ignore-volume-az = yes
kind: ConfigMap
metadata:
  creationTimestamp: "2021-06-08T10:10:55Z"
  name: openstack-cinder-config
  namespace: openshift-cluster-csi-drivers
  resourceVersion: "6576"
  uid: 0eb890e7-e084-443e-9f48-4dc77aa86c07
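Not part of the recorded output, but a quick sanity check for the multi-compute-zone layout would be to list the zone label the Cinder CSI driver publishes on each node (the same key used in the nodeSelector of the manual test above):

$ oc get nodes -L topology.cinder.csi.openstack.org/zone

Each node would be expected to report the Nova zone it was scheduled into (AZ-0, AZ-1 or AZ-2 in this layout).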
Test #3. One compute zone and three volume zones, with the same name on the first compute and volume zone.

install-config.yaml includes:

compute:
- name: worker
  platform:
    openstack:
      zones: ['nova']
      additionalNetworkIDs: []
  replicas: 3
controlPlane:
  name: master
  platform:
    openstack:
      zones: ['nova']
  replicas: 3

(shiftstack) [stack@undercloud-0 ~]$ openstack availability zone list --compute
+-----------+-------------+
| Zone Name | Zone Status |
+-----------+-------------+
| nova      | available   |
+-----------+-------------+
(shiftstack) [stack@undercloud-0 ~]$ openstack availability zone list --volume
+-----------+-------------+
| Zone Name | Zone Status |
+-----------+-------------+
| nova      | available   |
| cinderAZ0 | available   |
| cinderAZ1 | available   |
+-----------+-------------+

Once the cluster is up, ignore-volume-az is set:

$ oc get cm -n openshift-cluster-csi-drivers openstack-cinder-config -o yaml
apiVersion: v1
data:
  cloud.conf: |
    [Global]
    use-clouds = true
    clouds-file = /etc/kubernetes/secret/clouds.yaml
    cloud = openstack
  multiaz-cloud.conf: |
    [Global]
    use-clouds = true
    clouds-file = /etc/kubernetes/secret/clouds.yaml
    cloud = openstack
    [BlockStorage]
    ignore-volume-az = yes
kind: ConfigMap
metadata:
  creationTimestamp: "2021-06-08T15:39:08Z"
  name: openstack-cinder-config
  namespace: openshift-cluster-csi-drivers
  resourceVersion: "5403"
  uid: f27b4ce2-0319-46b3-88fc-24c8886c8170

Test #4. One compute zone and one volume zone with the same name:

compute:
- name: worker
  platform:
    openstack:
      zones: []
      additionalNetworkIDs: []
  replicas: 2
controlPlane:
  name: master
  platform:
    openstack:
      zones: []
  replicas: 3

(shiftstack) [stack@undercloud-0 ~]$ openstack availability zone list --compute
+-----------+-------------+
| Zone Name | Zone Status |
+-----------+-------------+
| nova      | available   |
+-----------+-------------+
(shiftstack) [stack@undercloud-0 ~]$ openstack availability zone list --volume
+-----------+---------------+
| Zone Name | Zone Status   |
+-----------+---------------+
| nova      | available     |
| cinderAZ0 | not available |
| cinderAZ1 | not available |
+-----------+---------------+

Unexpectedly, the flag is enabled in this layout too, so the following bug has been filed:
https://bugzilla.redhat.com/show_bug.cgi?id=1969945
However, the impact of having the flag enabled is minor.

[1] https://github.com/openshift/openstack-cinder-csi-driver-operator/blob/master/hack/e2e.sh
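As a general note (not part of the recorded output above), one quick way to check whether the flag ended up in the rendered driver config on any given cluster, using the same ConfigMap shown in full in the tests above, would be:

$ oc get cm -n openshift-cluster-csi-drivers openstack-cinder-config -o yaml | grep -B1 ignore-volume-az

which should print the [BlockStorage] section with ignore-volume-az = yes when the multi-AZ config is in effect.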
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2021:2438