Verified with OCP build 4.4.0-0.nightly-2020-06-23-102753; steps are below.

Before verifying the bug, check the current status of the apiservers. The kube-apiserver pods have already restarted a few times (see the RESTARTS column below); this is because bug 1837992 is not backported to 4.4.

$ oc get pods -A | grep -E 'apiserver|NAME' | grep -vE 'installer|revision|catalog'
NAMESPACE                           NAME                                            READY   STATUS    RESTARTS   AGE
openshift-apiserver-operator        openshift-apiserver-operator-7d68cd5574-dl49s   1/1     Running   2          65m
openshift-apiserver                 apiserver-6b4776d799-cr5pq                      1/1     Running   0          55m
openshift-apiserver                 apiserver-6b4776d799-gwxlw                      1/1     Running   0          57m
openshift-apiserver                 apiserver-6b4776d799-jrh8n                      1/1     Running   0          56m
openshift-kube-apiserver-operator   kube-apiserver-operator-7c98b4cd9f-6gnzr        1/1     Running   2          65m
openshift-kube-apiserver            kube-apiserver-kewang24azure41-bhpqr-master-0   4/4     Running   3          41m
openshift-kube-apiserver            kube-apiserver-kewang24azure41-bhpqr-master-1   4/4     Running   5          46m
openshift-kube-apiserver            kube-apiserver-kewang24azure41-bhpqr-master-2   4/4     Running   5          44m

- Creating a StorageClass and PVC for the non-zoned case:

$ cat sc-non-zoned.yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  annotations:
  labels:
    kubernetes.io/cluster-service: "true"
  name: managed-premium-nonzoned
parameters:
  kind: Managed
  storageaccounttype: Premium_LRS
  zoned: "false"
provisioner: kubernetes.io/azure-disk
volumeBindingMode: WaitForFirstConsumer

$ oc apply -f sc-non-zoned.yaml
storageclass.storage.k8s.io/managed-premium-nonzoned created

$ oc get sc
NAME                        PROVISIONER                RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
managed-premium (default)   kubernetes.io/azure-disk   Delete          WaitForFirstConsumer   true                   1h
managed-premium-nonzoned    kubernetes.io/azure-disk   Delete          WaitForFirstConsumer   false                  56m

$ cat pvc-non-zoned.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: azure-managed-non
spec:
  accessModes:
  - ReadWriteOnce
  storageClassName: managed-premium-nonzoned
  resources:
    requests:
      storage: 5Gi

$ oc apply -f pvc-non-zoned.yaml
persistentvolumeclaim/azure-managed-non created

$ oc get pvc
NAME                STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS               AGE
azure-managed-non   Bound    pvc-7e772bd1-1ca3-4edb-b443-2dea0f2bb76e   5Gi        RWO            managed-premium-nonzoned   56m

$ cat mypod-non-zoned.yaml
kind: Pod
apiVersion: v1
metadata:
  name: mypod
spec:
  containers:
  - name: mypod
    image: nginx:1.15.5
    resources:
      requests:
        cpu: 100m
        memory: 128Mi
      limits:
        cpu: 250m
        memory: 256Mi
    volumeMounts:
    - mountPath: "/mnt/azure"
      name: volume
  volumes:
  - name: volume
    persistentVolumeClaim:
      claimName: azure-managed-non

$ oc create -f mypod-non-zoned.yaml
pod/mypod created

Check the created pod's status:

$ oc describe pod/mypod
Name:         mypod
Namespace:    default
...
Status:       Pending
...
Conditions:
  Type           Status
  PodScheduled   False
Volumes:
  volume:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  azure-managed-non
    ReadOnly:   false
  default-token-gwhm7:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-gwhm7
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/memory-pressure:NoSchedule
                 node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason            Age                 From               Message
  ----     ------            ----                ----               -------
  Warning  FailedScheduling  <unknown>           default-scheduler  0/6 nodes are available: 3 node(s) had taints that the pod didn't tolerate, 3 node(s) had volume node affinity conflict.
  Warning  FailedScheduling  <unknown>           default-scheduler  0/6 nodes are available: 3 node(s) had taints that the pod didn't tolerate, 3 node(s) had volume node affinity conflict.
  Warning  FailedScheduling  16s (x29 over 34m)  default-scheduler  0/6 nodes are available: 3 node(s) had taints that the pod didn't tolerate, 3 node(s) had volume node affinity conflict.

From the above results, the non-zoned PVC does not match the NodeSelectorTerms on OCP 4.4, so the pod stays Pending with a volume node affinity conflict; the same scenario works fine on OCP 4.5 and 4.6.
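For context on the "volume node affinity conflict" message: a PV provisioned by the in-tree azure-disk plugin carries a node affinity of roughly the shape sketched below, and the pod can only land on nodes whose topology labels satisfy it. This is only an illustrative sketch, not output captured from this cluster; the region/zone values are placeholders.

apiVersion: v1
kind: PersistentVolume
metadata:
  name: pvc-7e772bd1-1ca3-4edb-b443-2dea0f2bb76e   # PV name taken from the PVC binding above
spec:
  storageClassName: managed-premium-nonzoned
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: failure-domain.beta.kubernetes.io/region
          operator: In
          values:
          - <region>                 # placeholder: the cluster's Azure region
        - key: failure-domain.beta.kubernetes.io/zone
          operator: In
          values:
          - <zone-or-fault-domain>   # placeholder: where the 4.4 mismatch for non-zoned disks shows up

The node side of the comparison can be inspected with `oc get nodes --show-labels` to see which topology labels the workers actually carry.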
- Creating a PVC for the zoned case. A default zoned StorageClass already exists, so there is no need to create a new one.

$ oc get sc
NAME                        PROVISIONER                RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
managed-premium (default)   kubernetes.io/azure-disk   Delete          WaitForFirstConsumer   true                   1h

$ cat pvc-zoned.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: azure-managed-disk
spec:
  accessModes:
  - ReadWriteOnce
  storageClassName: managed-premium
  resources:
    requests:
      storage: 5Gi

$ oc apply -f pvc-zoned.yaml
persistentvolumeclaim/azure-managed-disk created

$ oc get pvc
NAME                 STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS      AGE
azure-managed-disk   Bound    pvc-9415a7e6-93e3-4892-9c36-cdd47c16fe02   5Gi        RWO            managed-premium   58m
...

$ cat mypod-zoned.yaml
kind: Pod
apiVersion: v1
metadata:
  name: mypod1
spec:
  containers:
  - name: mypod1
    image: nginx:1.15.5
    resources:
      requests:
        cpu: 100m
        memory: 128Mi
      limits:
        cpu: 250m
        memory: 256Mi
    volumeMounts:
    - mountPath: "/mnt/azure"
      name: volume
  volumes:
  - name: volume
    persistentVolumeClaim:
      claimName: azure-managed-disk

$ oc apply -f mypod-zoned.yaml
pod/mypod1 created

$ oc get pods
NAME     READY   STATUS    RESTARTS   AGE
mypod    0/1     Pending   0          72m
mypod1   1/1     Running   0          59m

$ oc get pods -A | grep -E 'apiserver|NAME' | grep -vE 'installer|revision|catalog'
NAMESPACE                           NAME                                            READY   STATUS    RESTARTS   AGE
openshift-apiserver-operator        openshift-apiserver-operator-7d68cd5574-dl49s   1/1     Running   2          14h
openshift-apiserver                 apiserver-6b4776d799-cr5pq                      1/1     Running   0          14h
openshift-apiserver                 apiserver-6b4776d799-gwxlw                      1/1     Running   0          14h
openshift-apiserver                 apiserver-6b4776d799-jrh8n                      1/1     Running   0          14h
openshift-kube-apiserver-operator   kube-apiserver-operator-7c98b4cd9f-6gnzr        1/1     Running   2          14h
openshift-kube-apiserver            kube-apiserver-kewang24azure41-bhpqr-master-0   4/4     Running   3          13h
openshift-kube-apiserver            kube-apiserver-kewang24azure41-bhpqr-master-1   4/4     Running   5          13h
openshift-kube-apiserver            kube-apiserver-kewang24azure41-bhpqr-master-2   4/4     Running   5          13h

From the above test results, no new crashloop occurred: whether the StorageClass and PVC are zoned or non-zoned, the kube-apiservers do not crash (the RESTARTS counts are unchanged from the start of the test). Moving the bug to VERIFIED.
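As an additional, optional check that none of the kube-apiserver containers crash-looped during the test, the last termination state and previous logs of each container can be inspected. The commands below are generic oc invocations, not transcripts from this cluster; <master-node> is a placeholder for one of the master node names shown above.

$ oc get pods -n openshift-kube-apiserver
$ oc describe pod -n openshift-kube-apiserver kube-apiserver-<master-node> | grep -A 3 'Last State'
$ oc logs -n openshift-kube-apiserver kube-apiserver-<master-node> -c kube-apiserver --previous | tail -n 20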
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:2713