Bug 2073617 - [IBM] allowedTopologies in SC causes scheduling to fail when region is empty
Summary: [IBM] allowedTopologies in SC causes scheduling to fail when region is empty
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Storage
Version: 4.10
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 4.12.0
Assignee: Jonathan Dobson
QA Contact: Chao Yang
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2022-04-08 23:10 UTC by Jonathan Dobson
Modified: 2023-01-17 19:48 UTC (History)
CC List: 0 users

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2023-01-17 19:48:11 UTC
Target Upstream Version:
Embargoed:




Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift ibm-vpc-block-csi-driver pull 14 | 0 | None | open | Rebase: ibm-vpc-block-csi-driver v4.3.0 | 2022-05-24 21:43:42 UTC
Red Hat Product Errata | RHSA-2022:7399 | 0 | None | None | None | 2023-01-17 19:48:35 UTC

Description Jonathan Dobson 2022-04-08 23:10:10 UTC
Description of problem:
Setting allowedTopologies in the storage class with a match only on "failure-domain.beta.kubernetes.io/zone" causes "failure-domain.beta.kubernetes.io/region" to be set to an empty string in the PV's node affinity as well, which makes scheduling of the pod fail.

Version-Release number of selected component (if applicable):
$ oc version
Client Version: 4.10.8
Server Version: 4.10.8
Kubernetes Version: v1.23.5+1f952b3

How reproducible:
Always

Steps to Reproduce:

$ cat allowedtopo_sc_pvc_pod.yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: topology-sc
parameters:
  csi.storage.k8s.io/fstype: ext4
  encrypted: "false"
  encryptionKey: ""
  profile: 10iops-tier
  region: ""
  resourceGroup: ""
  tags: ""
  zone: ""
provisioner: vpc.block.csi.ibm.io
reclaimPolicy: Delete
volumeBindingMode: Immediate
allowVolumeExpansion: true
allowedTopologies:
- matchLabelExpressions:
  - key: failure-domain.beta.kubernetes.io/zone
    values:
    - us-south-1
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: topology-pvc
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  storageClassName: topology-sc
---
apiVersion: v1
kind: Pod
metadata:
  name: example
  labels:
    app: example
spec:
  volumes:
  - name: vol1
    persistentVolumeClaim:
      claimName: topology-pvc
  containers:
  - image: busybox
    command:
      - "sleep"
      - "604800"
    imagePullPolicy: IfNotPresent
    name: busybox
    volumeMounts:
      - mountPath: "/data"
        name: vol1
  restartPolicy: Always

$ oc apply -f allowedtopo_sc_pvc_pod.yaml 
storageclass.storage.k8s.io/topology-sc created
persistentvolumeclaim/topology-pvc created
pod/example created


Actual results:

The pod cannot be scheduled because the PV's node affinity requires the node's region label to equal an empty string, which no node satisfies.

$ oc describe pods | tail -6
Events:
  Type     Reason            Age                 From               Message
  ----     ------            ----                ----               -------
  Warning  FailedScheduling  2m7s                default-scheduler  0/6 nodes are available: 6 pod has unbound immediate PersistentVolumeClaims.
  Warning  FailedScheduling  99s (x1 over 2m6s)  default-scheduler  0/6 nodes are available: 6 pod has unbound immediate PersistentVolumeClaims.
  Warning  FailedScheduling  28s                 default-scheduler  0/6 nodes are available: 3 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn't tolerate, 3 node(s) had volume node affinity conflict.

$ oc get pv -o yaml | grep -A 11 nodeAffinity
    nodeAffinity:
      required:
        nodeSelectorTerms:
        - matchExpressions:
          - key: failure-domain.beta.kubernetes.io/region
            operator: In
            values:
            - ""
          - key: failure-domain.beta.kubernetes.io/zone
            operator: In
            values:
            - us-south-1


Expected results:

If the region is not specified under allowedTopologies, it should not be set to an empty string on the PV, and the pod should be scheduled.


Master Log:

Node Log (of failed PODs):

PV Dump:

$ oc get pv -o yaml
apiVersion: v1
items:
- apiVersion: v1
  kind: PersistentVolume
  metadata:
    annotations:
      pv.kubernetes.io/provisioned-by: vpc.block.csi.ibm.io
    creationTimestamp: "2022-04-08T22:57:30Z"
    finalizers:
    - kubernetes.io/pv-protection
    name: pvc-e17b0dfd-80e3-4133-a7f5-72d98317371a
    resourceVersion: "2031920"
    uid: defcff04-b6b9-4e80-8208-fa7cba6da8b1
  spec:
    accessModes:
    - ReadWriteOnce
    capacity:
      storage: 10Gi
    claimRef:
      apiVersion: v1
      kind: PersistentVolumeClaim
      name: topology-pvc
      namespace: default
      resourceVersion: "2031752"
      uid: e17b0dfd-80e3-4133-a7f5-72d98317371a
    csi:
      driver: vpc.block.csi.ibm.io
      fsType: ext4
      volumeAttributes:
        clusterID: ""
        failure-domain.beta.kubernetes.io/region: ""
        failure-domain.beta.kubernetes.io/zone: us-south-1
        iops: "3000"
        storage.kubernetes.io/csiProvisionerIdentity: 1649109173235-8081-vpc.block.csi.ibm.io
        tags: ""
        volumeCRN: crn:v1:bluemix:public:is:us-south-1:a/18240e5ed3f647cb96ca52dc92d9addd::volume:r006-741ec11e-8cf3-4830-bcaf-fafdf929fb4e
        volumeId: r006-741ec11e-8cf3-4830-bcaf-fafdf929fb4e
      volumeHandle: r006-741ec11e-8cf3-4830-bcaf-fafdf929fb4e
    nodeAffinity:
      required:
        nodeSelectorTerms:
        - matchExpressions:
          - key: failure-domain.beta.kubernetes.io/region
            operator: In
            values:
            - ""
          - key: failure-domain.beta.kubernetes.io/zone
            operator: In
            values:
            - us-south-1
    persistentVolumeReclaimPolicy: Delete
    storageClassName: topology-sc
    volumeMode: Filesystem
  status:
    phase: Bound
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""


PVC Dump:

$ oc get pvc -o yaml
apiVersion: v1
items:
- apiVersion: v1
  kind: PersistentVolumeClaim
  metadata:
    annotations:
      kubectl.kubernetes.io/last-applied-configuration: |
        {"apiVersion":"v1","kind":"PersistentVolumeClaim","metadata":{"annotations":{},"name":"topology-pvc","namespace":"default"},"spec":{"accessModes":["ReadWriteOnce"],"resources":{"requests":{"storage":"10Gi"}},"storageClassName":"topology-sc"}}
      pv.kubernetes.io/bind-completed: "yes"
      pv.kubernetes.io/bound-by-controller: "yes"
      volume.beta.kubernetes.io/storage-provisioner: vpc.block.csi.ibm.io
      volume.kubernetes.io/storage-provisioner: vpc.block.csi.ibm.io
    creationTimestamp: "2022-04-08T22:57:03Z"
    finalizers:
    - kubernetes.io/pvc-protection
    name: topology-pvc
    namespace: default
    resourceVersion: "2031923"
    uid: e17b0dfd-80e3-4133-a7f5-72d98317371a
  spec:
    accessModes:
    - ReadWriteOnce
    resources:
      requests:
        storage: 10Gi
    storageClassName: topology-sc
    volumeMode: Filesystem
    volumeName: pvc-e17b0dfd-80e3-4133-a7f5-72d98317371a
  status:
    accessModes:
    - ReadWriteOnce
    capacity:
      storage: 10Gi
    phase: Bound
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""


StorageClass Dump (if StorageClass used by PV/PVC):

$ oc get sc topology-sc -o yaml
allowVolumeExpansion: true
allowedTopologies:
- matchLabelExpressions:
  - key: failure-domain.beta.kubernetes.io/zone
    values:
    - us-south-1
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"allowVolumeExpansion":true,"allowedTopologies":[{"matchLabelExpressions":[{"key":"failure-domain.beta.kubernetes.io/zone","values":["us-south-1"]}]}],"apiVersion":"storage.k8s.io/v1","kind":"StorageClass","metadata":{"annotations":{},"name":"topology-sc"},"parameters":{"csi.storage.k8s.io/fstype":"ext4","encrypted":"false","encryptionKey":"","profile":"10iops-tier","region":"","resourceGroup":"","tags":"","zone":""},"provisioner":"vpc.block.csi.ibm.io","reclaimPolicy":"Delete","volumeBindingMode":"Immediate"}
  creationTimestamp: "2022-04-08T22:57:03Z"
  name: topology-sc
  resourceVersion: "2031750"
  uid: 24fe2e9c-0b85-4934-a9f0-14d8dc88211f
parameters:
  csi.storage.k8s.io/fstype: ext4
  encrypted: "false"
  encryptionKey: ""
  profile: 10iops-tier
  region: ""
  resourceGroup: ""
  tags: ""
  zone: ""
provisioner: vpc.block.csi.ibm.io
reclaimPolicy: Delete
volumeBindingMode: Immediate


Additional info:

If I do set both zone and region in allowedTopologies, then the pod gets scheduled successfully.
But the e2e tests only set the zone, and this is triggering a failure.

https://prow.ci.openshift.org/view/gs/origin-ci-test/pr-logs/pull/openshift_release/24720/rehearse-24720-pull-ci-openshift-ibm-vpc-block-csi-driver-operator-master-e2e-ibmcloud-csi/1508442934234583040
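The workaround mentioned above (setting both zone and region in allowedTopologies) could look roughly like the fragment below, which would take the place of the allowedTopologies stanza in the topology-sc StorageClass from the reproduction steps. This is only a sketch: the region value "us-south" is an assumption inferred from the zone name "us-south-1" and is not stated in this report. Check the region label on the worker nodes before using it.

```yaml
# Sketch of the workaround: match on both zone and region so that
# the PV's node affinity does not end up with an empty region value.
allowedTopologies:
- matchLabelExpressions:
  - key: failure-domain.beta.kubernetes.io/zone
    values:
    - us-south-1
  - key: failure-domain.beta.kubernetes.io/region
    values:
    - us-south   # ASSUMPTION: region for zone us-south-1; verify against node labels
```

The remaining StorageClass fields (parameters, provisioner, reclaimPolicy, volumeBindingMode) stay as in the original manifest.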

If needed, you can run this test on an existing cluster with:

git clone https://github.com/openshift/kubernetes.git
cd kubernetes
make ginkgo
make WHAT=k8s.io/kubernetes/test/e2e/e2e.test
export KUBERNETES_PROVIDER=skeleton
./hack/ginkgo-e2e.sh --provider=$KUBERNETES_PROVIDER --ginkgo.focus="External.Storage.*vpc\.block\.csi\.ibm\.io.*Dynamic.PV.*binding.*topology.should.provision.a.volume.and.schedule.a.pod.with.AllowedTopologies" --storage.testdriver=$PWD/manifest.yaml

manifest.yaml comes from the operator repo; set "topology: true" in it to enable the test.
https://github.com/openshift/ibm-vpc-block-csi-driver-operator/blob/a21b5e98baaa73d43aaceea651c80ef79ba42bd2/test/e2e/manifest.yaml#L30

Comment 1 Jonathan Dobson 2022-04-08 23:12:53 UTC
Note: once this bug is fixed then we should update the test manifest to set "topology: true" again.
https://github.com/openshift/ibm-vpc-block-csi-driver-operator/blob/a21b5e98baaa73d43aaceea651c80ef79ba42bd2/test/e2e/manifest.yaml#L30

Comment 3 Jonathan Dobson 2022-05-09 20:48:38 UTC
This should be fixed in v4.3.0 upstream now:
https://github.com/kubernetes-sigs/ibm-vpc-block-csi-driver/releases/tag/v4.3.0
Assigning back to myself for the rebase downstream.

Comment 4 Jonathan Dobson 2022-05-24 21:43:42 UTC
The fix is included in this rebase PR:
https://github.com/openshift/ibm-vpc-block-csi-driver/pull/14

Comment 5 Chao Yang 2022-06-28 07:53:17 UTC
$ oc get pv/pvc-55148495-ad92-47ae-9b25-8a147d3421ef -o yaml | grep -A 11 nodeAffinity
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: failure-domain.beta.kubernetes.io/region
          operator: In
          values:
          - eu-gb
        - key: failure-domain.beta.kubernetes.io/zone
          operator: In
          values:
          - eu-gb-3
$ oc get pods -o wide
NAME      READY   STATUS    RESTARTS   AGE     IP            NODE                                NOMINATED NODE   READINESS GATES
example   1/1     Running   0          5m43s   10.129.2.11   qe-chaoibm28-njn2m-worker-3-vsvvb   <none>           <none>

Comment 6 Jonathan Dobson 2022-06-29 15:32:09 UTC
Rebase PR merged.

Comment 11 errata-xmlrpc 2023-01-17 19:48:11 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.12.0 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:7399

