Description of problem: We installed OCP 4.4 to Azure France Central region with IPI. When we wanted to deploy an app using PV, pod got below error: "Failed to bind volumes: pv "pvc-113b4883-ef9e-496f-a527-4ebbe4b0c4e1" node affinity doesn't match node "who-tdjh4-worker-francecentral3-j4zw6": No matching NodeSelectorTerms" PV has following affinity: nodeAffinity: required: nodeSelectorTerms: - matchExpressions: - key: failure-domain.beta.kubernetes.io/region operator: In values: - francecentral - key: failure-domain.beta.kubernetes.io/zone operator: In values: - francecentral-3 but the node label value is FranceCentral, so they didn't match. $ oc get node who-tdjh4-worker-francecentral3-j4zw6 --show-labels NAME STATUS ROLES AGE VERSION LABELS who-tdjh4-worker-francecentral3-j4zw6 Ready worker 9h v1.17.1+1aa1c48 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/instance-type=Standard_D4s_v3,beta.kubernetes.io/os=linux,failure-domain.beta.kubernetes.io/region=FranceCentral,failure-domain.beta.kubernetes.io/zone=francecentral-3,kubernetes.io/arch=amd64,kubernetes.io/hostname=who-tdjh4-worker-francecentral3-j4zw6,kubernetes.io/os=linux,node-role.kubernetes.io/worker=,node.kubernetes.io/instance-type=Standard_D4s_v3,node.openshift.io/os_id=rhcos,topology.kubernetes.io/region=FranceCentral,topology.kubernetes.io/zone=francecentral-3 When I overwrite the label, it works fine. Version-Release number of the following components: $ oc version Client Version: openshift-clients-4.4.0-202004250654 Server Version: 4.4.11 How reproducible: Steps to Reproduce: 1. Create a cluster on Azure France Central region 2. Deploy an app requires PV 3. Actual results: Pods are not placed due to not matching NodeSelectorTerms unless node labels overwrite with expected labels Expected results: Pods are placed without changing any node label. Additional info: Please attach logs from ansible-playbook with the -vvv flag
Installer does not handle the tags or annotations that end up on PVs or nodes. So moving to storage team for triaging the issue.
This appears to be fixed in latest/1.18 release but the PRs to bring back the changes verbatim is pretty big, because upstream has introduced a new ARM (azure resource manager) client - https://github.com/kubernetes/kubernetes/pull/86740 The crux of the fix for our purpose is: clientRegion: NormalizeAzureRegion(clientRegion), which seems to lower case region before applying to the node, whereas in OCP-4.4 code, we are handling zone correctly but not region: func (az *Cloud) makeZone(location string, zoneID int) string { return fmt.Sprintf("%s-%d", strings.ToLower(location), zoneID) } I would defer to cloud team about whether we want to "carry" a simple fix or backport those larger PRs that fix this problem. Also setting target to 4.4.z because this problem should be fixed in 4.5 and 4.6.
This seems to be the same issue as https://bugzilla.redhat.com/show_bug.cgi?id=1860832 which already has PRs open to resolve it, I believe we can close this as a duplicate
*** This bug has been marked as a duplicate of bug 1860830 ***