Verified with OCP build 4.3.0-0.nightly-2020-06-23-231659; steps below.

Before verifying the bug, checked the current status of the apiservers. The kube-apiservers had already restarted 4 times; this is because bug 1837992 is not backported to 4.3.

$ oc get pods -A | grep -E 'apiserver|NAME' | grep -vE 'installer|revision|catalog'
NAMESPACE                           NAME                                            READY   STATUS    RESTARTS   AGE
openshift-apiserver-operator        openshift-apiserver-operator-66977c5c67-qmxpd   1/1     Running   1          59m
openshift-apiserver                 apiserver-4l6h9                                 1/1     Running   0          51m
openshift-apiserver                 apiserver-6tgpf                                 1/1     Running   0          51m
openshift-apiserver                 apiserver-zcgkp                                 1/1     Running   0          52m
openshift-kube-apiserver-operator   kube-apiserver-operator-796f4664b7-2pc48        1/1     Running   1          59m
openshift-kube-apiserver            kube-apiserver-kewang24azure32-cgf88-master-0   3/3     Running   4          48m
openshift-kube-apiserver            kube-apiserver-kewang24azure32-cgf88-master-1   3/3     Running   4          26m
openshift-kube-apiserver            kube-apiserver-kewang24azure32-cgf88-master-2   3/3     Running   4          45m

- Creating a StorageClass and PVC in a non-zoned region:

$ cat sc-non-zoned.yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  annotations:
  labels:
    kubernetes.io/cluster-service: "true"
  name: managed-premium-nonzoned
parameters:
  kind: Managed
  storageaccounttype: Premium_LRS
  zoned: "false"
provisioner: kubernetes.io/azure-disk
volumeBindingMode: WaitForFirstConsumer

$ oc apply -f sc-non-zoned.yaml
storageclass.storage.k8s.io/managed-premium-nonzoned created

$ oc get sc
NAME                        PROVISIONER                AGE
managed-premium (default)   kubernetes.io/azure-disk   54m
managed-premium-nonzoned    kubernetes.io/azure-disk   7s

$ cat pvc-non-zoned.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: azure-managed-non
spec:
  accessModes:
  - ReadWriteOnce
  storageClassName: managed-premium-nonzoned
  resources:
    requests:
      storage: 5Gi

$ oc apply -f pvc-non-zoned.yaml
persistentvolumeclaim/azure-managed-non created

$ oc get pvc
NAME                STATUS    VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS               AGE
azure-managed-non   Pending                                      managed-premium-nonzoned   4s

$ cat mypod-non-zoned.yaml
kind: Pod
apiVersion: v1
metadata:
  name: mypod
spec:
  containers:
  - name: mypod
    image: nginx:1.15.5
    resources:
      requests:
        cpu: 100m
        memory: 128Mi
      limits:
        cpu: 250m
        memory: 256Mi
    volumeMounts:
    - mountPath: "/mnt/azure"
      name: volume
  volumes:
  - name: volume
    persistentVolumeClaim:
      claimName: azure-managed-non

$ oc create -f mypod-non-zoned.yaml
pod/mypod created

Checked the created pod's status:

$ oc get pods
NAME    READY   STATUS    RESTARTS   AGE
mypod   0/1     Pending   0          7m8s

$ oc describe pod/mypod
Name:         mypod
Namespace:    default
...
Status:       Pending
...
Conditions:
  Type           Status
  PodScheduled   False
Volumes:
  volume:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  azure-managed-non
    ReadOnly:   false
  default-token-tfnhm:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-tfnhm
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/memory-pressure:NoSchedule
                 node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason            Age        From               Message
  ----     ------            ----       ----               -------
  Warning  FailedScheduling  <unknown>  default-scheduler  Failed to bind volumes: pv "pvc-d47bebc8-cfb6-468a-9d21-06123a09621c" node affinity doesn't match node "kewang24azure32-cgf88-worker-westus23-45f5z": No matching NodeSelectorTerms
  Warning  FailedScheduling  <unknown>  default-scheduler  0/6 nodes are available: 3 node(s) had taints that the pod didn't tolerate, 3 node(s) had volume node affinity conflict.
  Warning  FailedScheduling  <unknown>  default-scheduler  0/6 nodes are available: 3 node(s) had taints that the pod didn't tolerate, 3 node(s) had volume node affinity conflict.

From the results above, the non-zoned PVC's volume doesn't match the NodeSelectorTerms of any node on OCP 4.3; the same steps work fine on OCP 4.5 and 4.6.

- Creating a StorageClass and PVC in a zoned region. A default zoned StorageClass already exists, so no new one is needed.
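The "volume node affinity conflict" above can be inspected directly by comparing the NodeSelectorTerms recorded on the provisioned PV with the topology labels the node actually carries. A diagnostic sketch, reusing the PV and node names from the scheduler events above (the jsonpath fields are standard PV spec fields):

```shell
# PV and node names taken from the FailedScheduling events above.
PV=pvc-d47bebc8-cfb6-468a-9d21-06123a09621c
NODE=kewang24azure32-cgf88-worker-westus23-45f5z

# NodeSelectorTerms the PV requires (zone/region labels on Azure):
oc get pv "$PV" -o jsonpath='{.spec.nodeAffinity.required.nodeSelectorTerms}{"\n"}'

# Topology labels the node actually carries, one per line:
oc get node "$NODE" --show-labels | tr ',' '\n' | grep -E 'failure-domain|topology'
```

If the PV's required terms name a zone label that no node in the region carries, every node fails the affinity check, which matches the "3 node(s) had volume node affinity conflict" message.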
$ oc get sc
NAME                        PROVISIONER                AGE
managed-premium (default)   kubernetes.io/azure-disk   56m
...

$ cat pvc-zoned.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: azure-managed-disk
spec:
  accessModes:
  - ReadWriteOnce
  storageClassName: managed-premium
  resources:
    requests:
      storage: 5Gi

$ oc apply -f pvc-zoned.yaml
persistentvolumeclaim/azure-managed-disk created

$ oc get pvc
NAME                 STATUS    VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS      AGE
azure-managed-disk   Pending                                      managed-premium   5s
...

$ cat mypod-zoned.yaml
kind: Pod
apiVersion: v1
metadata:
  name: mypod1
spec:
  containers:
  - name: mypod1
    image: nginx:1.15.5
    resources:
      requests:
        cpu: 100m
        memory: 128Mi
      limits:
        cpu: 250m
        memory: 256Mi
    volumeMounts:
    - mountPath: "/mnt/azure"
      name: volume
  volumes:
  - name: volume
    persistentVolumeClaim:
      claimName: azure-managed-disk

$ oc apply -f mypod-zoned.yaml
pod/mypod1 created

$ oc get pods
NAME     READY   STATUS    RESTARTS   AGE
mypod    0/1     Pending   0          7m8s
mypod1   1/1     Running   0          3m56s

$ oc get pods -A | grep -E 'apiserver|NAME' | grep -vE 'installer|revision|catalog'
NAMESPACE                           NAME                                            READY   STATUS    RESTARTS   AGE
openshift-apiserver-operator        openshift-apiserver-operator-66977c5c67-qmxpd   1/1     Running   1          143m
openshift-apiserver                 apiserver-4l6h9                                 1/1     Running   0          135m
openshift-apiserver                 apiserver-6tgpf                                 1/1     Running   0          135m
openshift-apiserver                 apiserver-zcgkp                                 1/1     Running   0          136m
openshift-kube-apiserver-operator   kube-apiserver-operator-796f4664b7-2pc48        1/1     Running   1          143m
openshift-kube-apiserver            kube-apiserver-kewang24azure32-cgf88-master-0   3/3     Running   4          132m
openshift-kube-apiserver            kube-apiserver-kewang24azure32-cgf88-master-1   3/3     Running   4          110m
openshift-kube-apiserver            kube-apiserver-kewang24azure32-cgf88-master-2   3/3     Running   4          129m

From the test results above, no new crashloop occurred: whether the StorageClass and PVC are zoned or non-zoned, the kube-apiservers do not crash (restart counts are unchanged from the start of the test). Moving the bug to VERIFIED.
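The "no new crashloop" check above can also be done without eyeballing the full pod listing, by pulling per-container restart counts and scanning for any CrashLoopBackOff pods. A sketch (the `app=openshift-kube-apiserver` label selector is an assumption about how the static pods are labelled; adjust if it differs on a given build):

```shell
# Per-container restart counts for the kube-apiserver static pods;
# compare against the counts recorded before the test.
oc get pods -n openshift-kube-apiserver -l app=openshift-kube-apiserver \
  -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.containerStatuses[*].restartCount}{"\n"}{end}'

# Any pod currently crash-looping anywhere in the cluster
# (STATUS is the 4th column of `oc get pods -A` output):
oc get pods -A --no-headers | awk '$4 == "CrashLoopBackOff"'
```

Empty output from the second command, plus unchanged restart counts from the first, is the condition used above to conclude the apiservers did not crash during the zoned/non-zoned PVC tests.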
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:2628