Bug 2066865

Summary: Flaky test: In-tree Volumes [Driver: azure-disk] [Testpattern: Dynamic PV (delayed binding)] topology should provision a volume and schedule a pod with AllowedTopologies
Product: OpenShift Container Platform Reporter: Jan Safranek <jsafrane>
Component: Storage Assignee: Jan Safranek <jsafrane>
Storage sub component: Kubernetes External Components QA Contact: Wei Duan <wduan>
Status: CLOSED ERRATA Docs Contact:
Severity: medium    
Priority: unspecified CC: aos-bugs, dgoodwin, miabbott
Version: 4.11   
Target Milestone: ---   
Target Release: 4.11.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
[sig-storage] In-tree Volumes [Driver: azure-disk] [Testpattern: Dynamic PV (delayed binding)] topology should provision a volume and schedule a pod with AllowedTopologies [Suite:openshift/conformance/parallel] [Suite:k8s]
Last Closed: 2022-08-10 10:55:23 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Jan Safranek 2022-03-22 16:18:42 UTC
This test flakes:
[sig-storage] In-tree Volumes [Driver: azure-disk] [Testpattern: Dynamic PV (delayed binding)] topology should provision a volume and schedule a pod with AllowedTopologies [Suite:openshift/conformance/parallel] [Suite:k8s]


Example:
https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-release-master-ci-4.11-upgrade-from-stable-4.10-e2e-azure-upgrade/1506022160437612544

The external-provisioner cannot provision a volume because:

Mar 21 23:41:27.699: INFO: At 2022-03-21 23:36:26 +0000 UTC - event for pvc-cr9kc: {disk.csi.azure.com_ci-op-xszgvzk7-fde6e-l2qb6-master-1_69fb2394-cb4d-4d2a-ad2d-6cb9357d5800 } ProvisioningFailed: failed to provision volume with StorageClass "e2e-topology-4126sl8vc": error generating accessibility requirements: topology map[topology.disk.csi.azure.com/zone:] from selected node "ci-op-xszgvzk7-fde6e-l2qb6-worker-westus-p89fc" is not in requisite: [map[topology.disk.csi.azure.com/zone:1]]

Comment 1 Jan Safranek 2022-03-22 16:21:29 UTC
This may be unrelated: I can see that the nodes in the failed run have weird topology labels:

                    "topology.disk.csi.azure.com/zone": "",
                    "topology.kubernetes.io/region": "westus",
                    "topology.kubernetes.io/zone": "0"

In a successful run it looks different:
                    "topology.disk.csi.azure.com/zone": "centralus-1",
                    "topology.kubernetes.io/region": "centralus",
                    "topology.kubernetes.io/zone": "centralus-1"

https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-release-master-ci-4.11-upgrade-from-stable-4.10-e2e-azure-upgrade/1502150912049680384

Comment 2 Jan Safranek 2022-03-22 16:49:26 UTC
pull-ci-openshift-installer-master-e2e-azure-upi has been failing almost constantly since 2022-03-02.

The first failing one: https://prow.ci.openshift.org/view/gs/origin-ci-test/pr-logs/pull/openshift_installer/5674/pull-ci-openshift-installer-master-e2e-azure-upi/1499128361434222592
The last successful one: https://prow.ci.openshift.org/view/gs/origin-ci-test/pr-logs/pull/openshift_installer/5665/pull-ci-openshift-installer-master-e2e-azure-upi/1498979872859492352

(there were a couple of failed installs in between)

Comment 3 Jan Safranek 2022-03-22 16:54:47 UTC
[The e2e-azure-upi job linked in the previous comment may be unrelated to the flake in comment #0; it uses a different install workflow.]

Comment 4 Jan Safranek 2022-03-24 12:57:30 UTC
"westus" region is special - it does not have availability zones. The test creates StorageClass requesting explicit topology discovered from (in-tree) Node labels:

allowedTopologies:
- matchLabelExpressions:
  - key: failure-domain.beta.kubernetes.io/zone
    values:
    - "0"
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: e2e-topology-84vk7sp
provisioner: kubernetes.io/azure-disk
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer

PVC created by the test:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  annotations:
    volume.beta.kubernetes.io/storage-provisioner: disk.csi.azure.com
    volume.kubernetes.io/selected-node: ci-op-9j1nby77-e54d5-9zlwv-worker-westus-ljw4g
    volume.kubernetes.io/storage-provisioner: disk.csi.azure.com
  creationTimestamp: "2022-03-24T12:36:08Z"
  finalizers:
  - kubernetes.io/pvc-protection
  generateName: pvc-
  name: pvc-ph9tb
  namespace: e2e-topology-84
  resourceVersion: "135683"
  uid: 7596de95-b733-46d3-93f2-f1309a244d20
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: e2e-topology-84vk7sp
  volumeMode: Filesystem

The external-provisioner then cannot compute topology requirements from the StorageClass + PVC.Annotations["volume.kubernetes.io/selected-node"]:

I0324 12:36:09.194043       1 event.go:285] Event(v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"e2e-topology-84", Name:"pvc-ph9tb", UID:"7596de95-b733-46d3-93f2-f1309a244d20", APIVersion:"v1", ResourceVersion:"135683", FieldPath:""}): type: 'Warning' reason: 'ProvisioningFailed' failed to provision volume with StorageClass "e2e-topology-84vk7sp": error generating accessibility requirements: topology map[topology.disk.csi.azure.com/zone:] from selected node "ci-op-9j1nby77-e54d5-9zlwv-worker-westus-ljw4g" is not in requisite: [map[topology.disk.csi.azure.com/zone:0]]
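
For reference, a rough reconstruction of what the external-provisioner ends up comparing after CSI translation of the StorageClass above. The key and values are taken from the error message; the rest is illustrative and not an actual object from the run:

allowedTopologies:
- matchLabelExpressions:
  - key: topology.disk.csi.azure.com/zone   # translated from failure-domain.beta.kubernetes.io/zone
    values:
    - "0"                                   # the in-tree fault domain value is kept as-is
# ...while the selected node reports:
#   topology.disk.csi.azure.com/zone: ""
# so the node's topology is not in the requisite set and provisioning fails.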

Comment 5 Jan Safranek 2022-03-24 13:25:05 UTC
Upstream issue: https://github.com/kubernetes/kubernetes/issues/108980

Comment 6 Jan Safranek 2022-03-31 08:24:52 UTC
Upstream PR to fix the CSI translation: https://github.com/kubernetes/kubernetes/pull/109154

It will not fix the issue completely. In Azure regions without availability zones, the in-tree volume plugin (and cloud provider) uses the machine's failure domain instead of an availability zone. I don't understand how nodes are distributed among failure domains; to me it seems completely random.

The CSI driver does not use failure domains and reports `zone: ""` in these regions, which expresses the volume topology better: a PV in such a region can be used by any node.

From the perspective of the failing test it looks as if there were real availability zones (per the in-tree labels) and a PV in one zone could not be used in another one, but that is not true in reality. The test expects that when it provisions a volume in failure domain "1" (via StorageClass.allowedTopologies), the PV gets provisioned there and only nodes from that failure domain can use it. However, such a topology requirement is translated to CSI as "", so the provisioned PV can be used in any failure domain, and sometimes a node in domain "0" is chosen. The test then fails, even though there is no error on the Kubernetes / OCP side. I am going to skip this test for the in-tree volume plugin on Azure and keep it for the CSI driver.
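
For comparison, a minimal sketch (not taken from any CI run) of how the same kind of allowedTopologies would look for the CSI driver in a zone-less region, assuming the test discovers the driver's own topology label from the nodes. Here the requisite matches what every node reports, which is why the test stays meaningful for the CSI driver:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: e2e-topology-csi-sketch     # hypothetical name, for illustration only
provisioner: disk.csi.azure.com
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
allowedTopologies:
- matchLabelExpressions:
  - key: topology.disk.csi.azure.com/zone
    values:
    - ""                            # what the CSI driver reports on every node in a zone-less region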

Comment 7 Jan Safranek 2022-04-05 13:18:31 UTC
I need to update openshift/origin.

Comment 13 errata-xmlrpc 2022-08-10 10:55:23 UTC
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:5069