Bug 1936871
| Summary: | [Cinder CSI] Topology aware provisioning doesn't work when Nova and Cinder AZs are different | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Mike Fedosin <mfedosin> |
| Component: | Storage | Assignee: | Mike Fedosin <mfedosin> |
| Storage sub component: | OpenStack CSI Drivers | QA Contact: | rlobillo |
| Status: | CLOSED ERRATA | Docs Contact: | |
| Severity: | high | | |
| Priority: | high | CC: | aos-bugs, jsafrane, juriarte, pprinett, rlobillo |
| Version: | 4.8 | Keywords: | TestBlocker, Triaged |
| Target Milestone: | --- | | |
| Target Release: | 4.8.0 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Enhancement |
| Doc Text: | Feature: In 4.8, the Cinder CSI driver operator automatically detects the OpenStack cloud parameters related to availability zones and configures the driver accordingly. Reason: Previously, users could not provision a PV in a volume availability zone whose name differs from the compute availability zone, because doing so required additional configuration of the Cinder CSI driver. Result: Users can provision PVs and mount them to pods in different availability zones. | | |
| Story Points: | --- | | |
| Clone Of: | | Environment: | |
| Last Closed: | 2021-07-27 22:51:56 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
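The Doc Text above says the operator detects the cloud's availability zone layout and configures the driver accordingly. As a quick, hedged sketch of how an admin can tell whether a cloud is affected (it reuses only the `openstack availability zone list` commands shown in the verification below; the `-f`/`-c` output flags are standard openstackclient formatting options, and the comparison itself is an illustration, not the operator's actual logic):

```console
# Compare compute (Nova) and volume (Cinder) availability zone names.
# If the two lists differ, PV provisioning required ignore-volume-az before this fix.
$ openstack availability zone list --compute -f value -c "Zone Name" | sort -u > /tmp/nova-azs
$ openstack availability zone list --volume  -f value -c "Zone Name" | sort -u > /tmp/cinder-azs
$ diff /tmp/nova-azs /tmp/cinder-azs \
    && echo "compute and volume zone names match" \
    || echo "zone names differ: ignore-volume-az is required"
```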
https://bugzilla.redhat.com/show_bug.cgi?id=1936871

Verified on 4.8.0-0.nightly-2021-06-08-034312 over OSP16.1 (RHOS-16.1-RHEL-8-20210323.n.0).

Test #1. One compute zone and three volume zones, all with different names. rootVolumes enabled for all the nodes.

install-config includes:

```yaml
compute:
- name: worker
  platform:
    openstack:
      zones: ['AZ-0', 'AZ-0', 'AZ-0']
      additionalNetworkIDs: []
      rootVolume:
        size: 25
        type: tripleo
        zones: ['cinderAZ0', 'cinderAZ1', 'cinderAZ0']
  replicas: 3
controlPlane:
  name: master
  platform:
    openstack:
      zones: ['AZ-0', 'AZ-0', 'AZ-0']
      rootVolume:
        size: 25
        type: tripleo
        zones: ['cinderAZ0', 'cinderAZ1', 'cinderAZ0']
  replicas: 3
```

where the project has the zones below:

```console
(shiftstack) [stack@undercloud-0 ~]$ openstack availability zone list --compute
+-----------+-------------+
| Zone Name | Zone Status |
+-----------+-------------+
| AZ-0      | available   |
+-----------+-------------+
(shiftstack) [stack@undercloud-0 ~]$ openstack availability zone list --volume
+-----------+-------------+
| Zone Name | Zone Status |
+-----------+-------------+
| nova      | available   |
| cinderAZ0 | available   |
| cinderAZ1 | available   |
+-----------+-------------+
```

Once the cluster is up, ignore-volume-az is set:

```console
$ oc get cm -n openshift-cluster-csi-drivers openstack-cinder-config -o yaml
apiVersion: v1
data:
  cloud.conf: |
    [Global]
    use-clouds = true
    clouds-file = /etc/kubernetes/secret/clouds.yaml
    cloud = openstack
  multiaz-cloud.conf: |
    [Global]
    use-clouds = true
    clouds-file = /etc/kubernetes/secret/clouds.yaml
    cloud = openstack
    [BlockStorage]
    ignore-volume-az = yes
kind: ConfigMap
metadata:
  creationTimestamp: "2021-06-03T15:30:21Z"
  name: openstack-cinder-config
  namespace: openshift-cluster-csi-drivers
  resourceVersion: "13561"
  uid: 5d1ab6c0-d4b7-44a2-95ce-073869da611e
```

Manual test: after loading the manifests below, the pods reach Running status while using PVCs created in different Cinder AZs:

```console
$ cat test_pvc_azs.yaml
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: topology-aware-cinder-az0
provisioner: cinder.csi.openstack.org
parameters:
  availability: cinderAZ0
volumeBindingMode: WaitForFirstConsumer
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-cinder-az0
  namespace: demo
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: topology-aware-cinder-az0
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: demo-0
  namespace: demo
spec:
  replicas: 1
  selector:
    matchLabels:
      app: demo-0
      cinder-az: cinderAZ0
      nova-az: AZ-0
  template:
    metadata:
      labels:
        app: demo-0
        cinder-az: cinderAZ0
        nova-az: AZ-0
    spec:
      containers:
      - name: demo
        image: quay.io/kuryr/demo
        ports:
        - containerPort: 80
          protocol: TCP
        volumeMounts:
        - mountPath: /var/lib/www/data
          name: mydata
      nodeSelector:
        topology.cinder.csi.openstack.org/zone: AZ-0
      volumes:
      - name: mydata
        persistentVolumeClaim:
          claimName: pvc-cinder-az0
          readOnly: false

$ cat test_pvc_azs2.yaml
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: topology-aware-cinder-az1
provisioner: cinder.csi.openstack.org
parameters:
  availability: cinderAZ1
volumeBindingMode: WaitForFirstConsumer
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-cinder-az1
  namespace: demo
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: topology-aware-cinder-az1
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: demo-1
  namespace: demo
spec:
  replicas: 1
  selector:
    matchLabels:
      app: demo-1
  template:
    metadata:
      labels:
        app: demo-1
    spec:
      containers:
      - name: demo
        image: quay.io/kuryr/demo
        ports:
        - containerPort: 80
          protocol: TCP
        volumeMounts:
        - mountPath: /var/lib/www/data
          name: mydata
      nodeSelector:
        topology.cinder.csi.openstack.org/zone: AZ-0
      volumes:
      - name: mydata
        persistentVolumeClaim:
          claimName: pvc-cinder-az1
          readOnly: false

$ oc get pods -n demo -o wide
NAME                      READY   STATUS    RESTARTS   AGE     IP            NODE                          NOMINATED NODE   READINESS GATES
demo-0-857fb67fb7-v6tqp   1/1     Running   0          4d23h   10.129.2.20   ostest-wjzt5-worker-2-cccmm   <none>           <none>
demo-1-7859fdc774-d96p2   1/1     Running   0          4d23h   10.129.2.19   ostest-wjzt5-worker-2-cccmm   <none>           <none>

$ oc get machines -A
NAMESPACE               NAME                          PHASE     TYPE        REGION      ZONE   AGE
openshift-machine-api   ostest-wjzt5-master-0         Running   m4.xlarge   regionOne   AZ-0   5d1h
openshift-machine-api   ostest-wjzt5-master-1         Running   m4.xlarge   regionOne   AZ-0   5d1h
openshift-machine-api   ostest-wjzt5-master-2         Running   m4.xlarge   regionOne   AZ-0   5d1h
openshift-machine-api   ostest-wjzt5-worker-0-4gjsv   Running   m4.xlarge   regionOne   AZ-0   5d
openshift-machine-api   ostest-wjzt5-worker-1-7qqcw   Running   m4.xlarge   regionOne   AZ-0   5d
openshift-machine-api   ostest-wjzt5-worker-2-cccmm   Running   m4.xlarge   regionOne   AZ-0   5d

$ oc get pvc -n demo
NAME             STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS                AGE
pvc-cinder-az0   Bound    pvc-f5b28358-c883-415f-95d5-51ef458e85a8   1Gi        RWO            topology-aware-cinder-az0   4d23h
pvc-cinder-az1   Bound    pvc-174a5a6e-e66e-4b30-9f7d-e085e2039f29   1Gi        RWO            topology-aware-cinder-az1   4d23h

$ openstack volume show pvc-f5b28358-c883-415f-95d5-51ef458e85a8 -c availability_zone
+-------------------+-----------+
| Field             | Value     |
+-------------------+-----------+
| availability_zone | cinderAZ0 |
+-------------------+-----------+

$ openstack volume show pvc-174a5a6e-e66e-4b30-9f7d-e085e2039f29 -c availability_zone
+-------------------+-----------+
| Field             | Value     |
+-------------------+-----------+
| availability_zone | cinderAZ1 |
+-------------------+-----------+
```

The whole CSI test suite [1] passed with the availability parameter set to 'cinderAZ0' on the StorageClasses, except for the two TCs mentioned in https://bugzilla.redhat.com/show_bug.cgi?id=1917710.

Test #2. Three compute zones and a single volume zone, with different names.

install-config.yaml includes:

```yaml
compute:
- name: worker
  platform:
    openstack:
      zones: ['AZ-0','AZ-1','AZ-2']
      additionalNetworkIDs: []
  replicas: 3
controlPlane:
  name: master
  platform:
    openstack:
      zones: ['AZ-0','AZ-1','AZ-2']
  replicas: 3
```

where the project has the zones below:

```console
(shiftstack) [stack@undercloud-0 ~]$ openstack availability zone list --compute
+-----------+-------------+
| Zone Name | Zone Status |
+-----------+-------------+
| AZ-0      | available   |
| AZ-1      | available   |
| AZ-2      | available   |
+-----------+-------------+
(shiftstack) [stack@undercloud-0 ~]$ openstack availability zone list --volume
+-----------+-------------+
| Zone Name | Zone Status |
+-----------+-------------+
| nova      | available   |
+-----------+-------------+
```

Once the cluster is up, ignore-volume-az is set:

```console
$ oc get cm -n openshift-cluster-csi-drivers openstack-cinder-config -o yaml
apiVersion: v1
data:
  cloud.conf: |
    [Global]
    use-clouds = true
    clouds-file = /etc/kubernetes/secret/clouds.yaml
    cloud = openstack
  multiaz-cloud.conf: |
    [Global]
    use-clouds = true
    clouds-file = /etc/kubernetes/secret/clouds.yaml
    cloud = openstack
    [BlockStorage]
    ignore-volume-az = yes
kind: ConfigMap
metadata:
  creationTimestamp: "2021-06-08T10:10:55Z"
  name: openstack-cinder-config
  namespace: openshift-cluster-csi-drivers
  resourceVersion: "6576"
  uid: 0eb890e7-e084-443e-9f48-4dc77aa86c07
```

Test #3. One compute zone and three volume zones; the first compute and volume zones share the same name.

install-config.yaml includes:

```yaml
compute:
- name: worker
  platform:
    openstack:
      zones: ['nova']
      additionalNetworkIDs: []
  replicas: 3
controlPlane:
  name: master
  platform:
    openstack:
      zones: ['nova']
  replicas: 3
```

```console
(shiftstack) [stack@undercloud-0 ~]$ openstack availability zone list --compute
+-----------+-------------+
| Zone Name | Zone Status |
+-----------+-------------+
| nova      | available   |
+-----------+-------------+
(shiftstack) [stack@undercloud-0 ~]$ openstack availability zone list --volume
+-----------+-------------+
| Zone Name | Zone Status |
+-----------+-------------+
| nova      | available   |
| cinderAZ0 | available   |
| cinderAZ1 | available   |
+-----------+-------------+
```

Once the cluster is up, ignore-volume-az is set:

```console
$ oc get cm -n openshift-cluster-csi-drivers openstack-cinder-config -o yaml
apiVersion: v1
data:
  cloud.conf: |
    [Global]
    use-clouds = true
    clouds-file = /etc/kubernetes/secret/clouds.yaml
    cloud = openstack
  multiaz-cloud.conf: |
    [Global]
    use-clouds = true
    clouds-file = /etc/kubernetes/secret/clouds.yaml
    cloud = openstack
    [BlockStorage]
    ignore-volume-az = yes
kind: ConfigMap
metadata:
  creationTimestamp: "2021-06-08T15:39:08Z"
  name: openstack-cinder-config
  namespace: openshift-cluster-csi-drivers
  resourceVersion: "5403"
  uid: f27b4ce2-0319-46b3-88fc-24c8886c8170
```

Test #4. One compute zone and one volume zone with the same name.

install-config.yaml includes:

```yaml
compute:
- name: worker
  platform:
    openstack:
      zones: []
      additionalNetworkIDs: []
  replicas: 2
controlPlane:
  name: master
  platform:
    openstack:
      zones: []
  replicas: 3
```

```console
(shiftstack) [stack@undercloud-0 ~]$ openstack availability zone list --compute
+-----------+-------------+
| Zone Name | Zone Status |
+-----------+-------------+
| nova      | available   |
+-----------+-------------+
(shiftstack) [stack@undercloud-0 ~]$ openstack availability zone list --volume
+-----------+---------------+
| Zone Name | Zone Status   |
+-----------+---------------+
| nova      | available     |
| cinderAZ0 | not available |
| cinderAZ1 | not available |
+-----------+---------------+
```

Unexpectedly, the flag is enabled here as well, so https://bugzilla.redhat.com/show_bug.cgi?id=1969945 has been filed. However, the impact of having the flag enabled is minor.

[1] https://github.com/openshift/openstack-cinder-csi-driver-operator/blob/master/hack/e2e.sh

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2438
Description of problem:

When the Cinder and Nova availability zones are different, we cannot provision a pod with an attached volume:

```console
  Type     Reason            Age   From               Message
  ----     ------            ----  ----               -------
  Warning  FailedScheduling  2m5s  default-scheduler  running PreBind plugin "VolumeBinding": binding volumes: pv "pvc-fda419fc-5dbc-4878-ab89-5cb1541a33a5" node affinity doesn't match node "ostest-dl27b-worker-0-cbdbh": no matching NodeSelectorTerms
  Warning  FailedScheduling  2m5s  default-scheduler  0/6 nodes are available: 3 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn't tolerate, 3 node(s) had volume node affinity conflict.
  Warning  FailedScheduling  2m2s  default-scheduler  0/6 nodes are available: 3 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn't tolerate, 3 node(s) had volume node affinity conflict.
```

This happens because the Cinder CSI driver adds a node affinity to the created PV such as:

```console
Node Affinity:
  Required Terms:
    Term 0:  topology.cinder.csi.openstack.org/zone in [AZ1]
```

To avoid this we need to set `ignore-volume-az = true` in the driver config (a quick way to confirm the setting on a fixed cluster is sketched at the end of this report).

How reproducible:
Always

Steps to Reproduce:
Create the following objects in the cluster:

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: topology-aware-standard
provisioner: cinder.csi.openstack.org
parameters:
  availability: AZ1
volumeBindingMode: WaitForFirstConsumer
allowedTopologies:
- matchLabelExpressions:
  - key: topology.cinder.csi.openstack.org/zone
    values:
    - "AZ-0"
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc1
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: topology-aware-standard
---
apiVersion: v1
kind: Pod
metadata:
  name: app
spec:
  containers:
  - image: nginx
    imagePullPolicy: IfNotPresent
    name: nginx
    ports:
    - containerPort: 80
      protocol: TCP
    volumeMounts:
    - mountPath: /var/lib/www/data
      name: mydata
  volumes:
  - name: mydata
    persistentVolumeClaim:
      claimName: pvc1
      readOnly: false
```

AZ-0 is a Nova AZ; AZ1 is a Cinder AZ.

Actual results:
Provisioning of the Pod fails because of scheduling issues.

Expected results:
The PV is provisioned and the Pod reaches the Running state.

Additional info:
Upstream issue: https://github.com/kubernetes/cloud-provider-openstack/issues/1300
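As referenced above, a quick way to confirm that the operator has applied the `ignore-volume-az` workaround on a running 4.8 cluster is to read back the generated config. This is only a sketch built from the ConfigMap shown in the verification comment; the jsonpath key escaping is the only piece added here:

```console
# Print the multi-AZ cloud config generated by the operator; when the workaround
# is active it contains "ignore-volume-az = yes" under [BlockStorage].
$ oc get cm -n openshift-cluster-csi-drivers openstack-cinder-config \
    -o jsonpath='{.data.multiaz-cloud\.conf}'
```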