Bug 1877681 - Manually created PV cannot be used
Summary: Manually created PV cannot be used
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Storage
Version: 4.6
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 4.7.0
Assignee: Tomas Smetana
QA Contact: Qin Ping
URL:
Whiteboard:
Depends On: 1900446
Blocks:
 
Reported: 2020-09-10 07:55 UTC by Qin Ping
Modified: 2021-02-24 15:18 UTC
CC List: 2 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: On the OpenStack platform, the admission plugin always added failure-domain and region labels to statically provisioned persistent volumes, even when these values were not configured (empty).
Consequence: Pods using statically provisioned persistent volumes failed to start on OpenStack clusters configured with an empty region.
Fix: The labels are now added to the volume only when they contain a valid region and failure domain, as is already done for dynamically provisioned persistent volumes.
Result: Pods using statically provisioned volumes behave the same as pods using dynamically provisioned volumes on OpenStack clusters configured with an empty region or failure domain.
Clone Of:
Environment:
Last Closed: 2021-02-24 15:17:43 UTC
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Github kubernetes kubernetes pull 95174 0 None closed Don't add empty AZ labels to OpenStack pre-provisioned PVs 2021-01-13 10:29:17 UTC
Github openshift kubernetes pull 440 0 None closed Bug 1877681: UPSTREAM: 95174: Don't add empty AZ labels to OpenStack pre-provisioned PVs 2021-01-13 10:29:16 UTC
Red Hat Product Errata RHSA-2020:5633 0 None None None 2021-02-24 15:18:17 UTC

Description Qin Ping 2020-09-10 07:55:01 UTC
Description of Problem:
A manually created PV cannot be used because the PV is assigned a wrong nodeAffinity.

Version-Release number of selected component (if applicable):
4.6.0-0.nightly-2020-09-07-224533

How Reproducible:
Always


Steps to Reproduce:
1. Create a PV manually in an OSP16 cluster with Kuryr
2. Create a PVC
3. Create a Pod that uses this PVC (example manifests are shown under Additional info below)

Actual Results:
Pod cannot be scheduled:
  Warning  FailedScheduling  <unknown>        0/6 nodes are available: 3 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn't tolerate, 3 node(s) had volume node affinity conflict.
  Warning  FailedScheduling  <unknown>        0/6 nodes are available: 3 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn't tolerate, 3 node(s) had volume node affinity conflict.


Expected Results:
Pod runs successfully.

Additional info:
$ cat pv.yaml 
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-test
spec:
  capacity:
    storage: 1Gi
  accessModes:
  - ReadWriteOnce
  cinder:
    volumeID: d3c7c54a-bb69-447f-ae45-c5af2033a885
    fsType: ext4
  persistentVolumeReclaimPolicy: Delete
  storageClassName: sc-81qdr
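
The PVC and Pod manifests from steps 2 and 3 are not included in the report; an illustrative pair that matches the pv.yaml above could look like this (the names pvc-test and pod-test and the container image are placeholders, not taken from the original report):

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-test
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  # Same storageClassName as the PV above; volumeName pins the claim to pv-test.
  storageClassName: sc-81qdr
  volumeName: pv-test
---
apiVersion: v1
kind: Pod
metadata:
  name: pod-test
spec:
  containers:
  - name: test
    image: registry.access.redhat.com/ubi8/ubi-minimal
    command: ["sleep", "infinity"]
    volumeMounts:
    - name: data
      mountPath: /mnt/data
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: pvc-test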

Manually created PV's nodeAffinity:
$ oc get pv pv-test -ojson |jq .spec.nodeAffinity
{
  "required": {
    "nodeSelectorTerms": [
      {
        "matchExpressions": [
          {
            "key": "failure-domain.beta.kubernetes.io/zone",
            "operator": "In",
            "values": [
              "nova"
            ]
          },
          {
            "key": "failure-domain.beta.kubernetes.io/region",
            "operator": "In",
            "values": [
              ""
            ]
          }
        ]
      }
    ]
  }
}

Dynamically provisioned PV's nodeAffinity:
$ oc get pv pvc-547ba420-f426-4db4-b336-b8b961e259d1 -ojson|jq .spec.nodeAffinity
{
  "required": {
    "nodeSelectorTerms": [
      {
        "matchExpressions": [
          {
            "key": "failure-domain.beta.kubernetes.io/zone",
            "operator": "In",
            "values": [
              "nova"
            ]
          }
        ]
      }
    ]
  }
}

Node labels:
$ oc get node --show-labels 
NAME                          STATUS   ROLES    AGE   VERSION                LABELS
ostest-rfp79-master-0         Ready    master   44h   v1.19.0-rc.2+068702d   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/instance-type=m4.xlarge,beta.kubernetes.io/os=linux,failure-domain.beta.kubernetes.io/zone=nova,kubernetes.io/arch=amd64,kubernetes.io/hostname=ostest-rfp79-master-0,kubernetes.io/os=linux,node-role.kubernetes.io/master=,node.kubernetes.io/instance-type=m4.xlarge,node.openshift.io/os_id=rhcos,topology.kubernetes.io/zone=nova
ostest-rfp79-master-1         Ready    master   44h   v1.19.0-rc.2+068702d   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/instance-type=m4.xlarge,beta.kubernetes.io/os=linux,failure-domain.beta.kubernetes.io/zone=nova,kubernetes.io/arch=amd64,kubernetes.io/hostname=ostest-rfp79-master-1,kubernetes.io/os=linux,node-role.kubernetes.io/master=,node.kubernetes.io/instance-type=m4.xlarge,node.openshift.io/os_id=rhcos,topology.kubernetes.io/zone=nova
ostest-rfp79-master-2         Ready    master   44h   v1.19.0-rc.2+068702d   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/instance-type=m4.xlarge,beta.kubernetes.io/os=linux,failure-domain.beta.kubernetes.io/zone=nova,kubernetes.io/arch=amd64,kubernetes.io/hostname=ostest-rfp79-master-2,kubernetes.io/os=linux,node-role.kubernetes.io/master=,node.kubernetes.io/instance-type=m4.xlarge,node.openshift.io/os_id=rhcos,topology.kubernetes.io/zone=nova
ostest-rfp79-worker-0-bqr47   Ready    worker   44h   v1.19.0-rc.2+068702d   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/instance-type=m4.xlarge,beta.kubernetes.io/os=linux,efp6i=testfor510646,failure-domain.beta.kubernetes.io/zone=nova,kubernetes.io/arch=amd64,kubernetes.io/hostname=ostest-rfp79-worker-0-bqr47,kubernetes.io/os=linux,node-role.kubernetes.io/worker=,node.kubernetes.io/instance-type=m4.xlarge,node.openshift.io/os_id=rhcos,topology.kubernetes.io/zone=nova
ostest-rfp79-worker-0-fs2ms   Ready    worker   44h   v1.19.0-rc.2+068702d   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/instance-type=m4.xlarge,beta.kubernetes.io/os=linux,failure-domain.beta.kubernetes.io/zone=nova,kubernetes.io/arch=amd64,kubernetes.io/hostname=ostest-rfp79-worker-0-fs2ms,kubernetes.io/os=linux,node-role.kubernetes.io/worker=,node.kubernetes.io/instance-type=m4.xlarge,node.openshift.io/os_id=rhcos,topology.kubernetes.io/zone=nova
ostest-rfp79-worker-0-w64p6   Ready    worker   44h   v1.19.0-rc.2+068702d   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/instance-type=m4.xlarge,beta.kubernetes.io/os=linux,failure-domain.beta.kubernetes.io/zone=nova,kubernetes.io/arch=amd64,kubernetes.io/hostname=ostest-rfp79-worker-0-w64p6,kubernetes.io/os=linux,node-role.kubernetes.io/worker=,node.kubernetes.io/instance-type=m4.xlarge,node.openshift.io/os_id=rhcos,topology.kubernetes.io/zone=nova

Comment 3 Tomas Smetana 2020-09-14 10:06:09 UTC
It looks like there's a discrepancy between the dynamic provisioner and the OpenStack PVLabeler:

While the provisioner doesn't add the labels if they're empty:
https://github.com/openshift/origin/blob/master/vendor/k8s.io/kubernetes/pkg/volume/cinder/cinder_util.go#L228-L237

The cloud provider's PVLabeler (which is used by the admission controller for statically provisioned volumes) always returns the zone/region labels, even if they are empty strings:
https://github.com/openshift/origin/blob/master/vendor/k8s.io/legacy-cloud-providers/openstack/openstack_volumes.go#L745-L749

The fix should be trivial... However, I need to find a way to test this too.
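
In terms of the PV object in this environment (zone "nova", region unset), the difference is in the labels the PVLabeler returns to the admission controller, which then derives the nodeAffinity from them. A minimal before/after sketch of the PV metadata (a fragment for illustration, not the exact upstream change):

# Before the fix: both labels are applied, including the empty region value,
# which yields the unsatisfiable nodeAffinity term shown in the description.
metadata:
  labels:
    failure-domain.beta.kubernetes.io/zone: "nova"
    failure-domain.beta.kubernetes.io/region: ""
---
# After the fix: only non-empty values are returned, so the manually created
# PV carries just the zone label and a zone-only nodeAffinity, the same as a
# dynamically provisioned PV.
metadata:
  labels:
    failure-domain.beta.kubernetes.io/zone: "nova"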

Comment 4 Tomas Smetana 2020-09-15 14:14:13 UTC
I failed to reproduce this. I thought it would happen with every statically provisioned PV, but the PV I created in the PSI OpenStack cluster has no nodeAffinity set at all.

Comment 5 Tomas Smetana 2020-09-15 15:09:19 UTC
I also compared the PV from the must-gather with mine. The one from the must-gather has the failure-domain.beta.kubernetes.io/<region|zone> labels set while mine does not (i.e., no labels at all). The PVLabeler code is the same, so something else must be altering the PV object.

Comment 9 Tomas Smetana 2020-09-30 09:12:39 UTC
Upstream PR: https://github.com/kubernetes/kubernetes/pull/95174

Comment 12 Qin Ping 2020-11-25 03:33:40 UTC
Verified with: 4.7.0-0.nightly-2020-11-24-080601

Comment 15 errata-xmlrpc 2021-02-24 15:17:43 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.7.0 security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:5633

