Bug 1699820

Summary: StatefulSet tests are failing with vSphere plugin
Product: OpenShift Container Platform Reporter: Vadim Rutkovsky <vrutkovs>
Component: Storage Assignee: Bradley Childs <bchilds>
Status: CLOSED ERRATA QA Contact: Wenqi He <wehe>
Severity: unspecified Docs Contact:
Priority: high    
Version: 4.1.0 CC: aos-bugs, aos-storage-staff, bchilds, dphillip, eparis, fbertina, jsafrane
Target Milestone: ---   
Target Release: 4.1.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2019-06-04 10:47:37 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Vadim Rutkovsky 2019-04-15 10:04:44 UTC
Description of problem:

StatefulSet tests create PVCs without specifying size units 

Version-Release number of selected component (if applicable):

Latest 4.1

How reproducible:
Always

Steps to Reproduce:
1. See https://github.com/openshift/installer/tree/master/upi/vsphere
2. Run OpenShift conformance suite

Actual results:

The following tests fail:
[sig-apps] StatefulSet [k8s.io] Basic StatefulSet functionality [StatefulSetBasic] should adopt matching orphans and release non-matching pods [Suite:openshift/conformance/parallel] [Suite:k8s]
[sig-apps] StatefulSet [k8s.io] Basic StatefulSet functionality [StatefulSetBasic] should not deadlock when a pod's predecessor fails [Suite:openshift/conformance/parallel] [Suite:k8s]
[sig-apps] StatefulSet [k8s.io] Basic StatefulSet functionality [StatefulSetBasic] should perform rolling updates and roll backs of template modifications with PVCs [Suite:openshift/conformance/parallel] [Suite:k8s]
[sig-apps] StatefulSet [k8s.io] Basic StatefulSet functionality [StatefulSetBasic] should provide basic identity [Suite:openshift/conformance/parallel] [Suite:k8s]

These tests create PVCs with:
 
spec:
  accessModes:
  - ReadWriteOnce
  dataSource: null
  resources:
    requests:
      storage: "1"

This confuses the vSphere plugin, as it expects

  resources:
    requests:
      storage: "1Gi"

instead
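
In Kubernetes quantity notation, a bare number such as "1" means one byte, while binary suffixes such as Mi or Gi scale the value. A minimal sketch of that interpretation follows; the function name quantity_to_bytes is hypothetical, and it handles only a subset of the suffixes that Kubernetes accepts.

```python
from decimal import Decimal

# Subset of the suffixes Kubernetes accepts; chosen for illustration only.
BINARY = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40}
DECIMAL = {"k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12}


def quantity_to_bytes(quantity: str) -> int:
    """Convert a storage quantity string to bytes (hypothetical helper)."""
    # Check binary suffixes first so "Mi" is not mistaken for "M".
    for suffixes in (BINARY, DECIMAL):
        for suffix, factor in suffixes.items():
            if quantity.endswith(suffix):
                return int(Decimal(quantity[: -len(suffix)]) * factor)
    # No suffix: the number is a plain byte count.
    return int(Decimal(quantity))


print(quantity_to_bytes("1"))    # size requested by the StatefulSet tests
print(quantity_to_bytes("1Gi"))  # a 1 GiB request, for comparison
```

Under this reading, the test PVCs ask for a single byte of storage.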

Comment 1 Jan Safranek 2019-04-15 11:11:50 UTC
From controller-manager logs:

E0415 09:42:38.175328       1 vsphere.go:1235] Failed to create a vsphere volume with volumeOptions: &{CapacityKB:1 Tags:map[kubernetes.io/created-for/pvc/namespace:e2e-tests-statefulset-tccjc kubernetes.io/created-for/pvc/name:datadir-ss-0 kubernetes.io/created-for/pv/name:pvc-a382b675-5f62-11e9-a278-0050569be521] Name:kubernetes-dynamic-pvc-a382b675-5f62-11e9-a278-0050569be521 DiskFormat:thin Datastore:nvme-ds1 VSANStorageProfileData: StoragePolicyName: StoragePolicyID: SCSIControllerType:lsiLogic} on datastore: nvme-ds1. err: A specified parameter was not correct: 
I0415 09:42:38.175408       1 pv_controller.go:1511] failed to provision volume for claim "e2e-tests-statefulset-tccjc/datadir-ss-0" with StorageClass "thin": A specified parameter was not correct: 


On the other hand, this one succeeded:
I0415 09:31:24.084023       1 vsphere.go:1149] Starting to create a vSphere volume with volumeOptions: &{CapacityKB:1048576 Tags:map[kubernetes.io/created-for/pv/name:pvc-3c5ad65b-5f61-11e9-b12c-0050569bb200 kubernetes.io/created-for/pvc/namespace:pvc-test kubernetes.io/created-for/pvc/name:foo] Name:kubernetes-dynamic-pvc-3c5ad65b-5f61-11e9-b12c-0050569bb200 DiskFormat:thin Datastore: VSANStorageProfileData: StoragePolicyName: StoragePolicyID: SCSIControllerType:}
I0415 09:31:24.225370       1 vsphere_volume_util.go:157] Successfully created vsphere volume kubernetes-dynamic-pvc-3c5ad65b-5f61-11e9-b12c-0050569bb200


Size seems to be the root cause.
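
The difference between the two requests can be read directly from the CapacityKB field. The sketch below pulls that field out of log lines and prints it in a larger unit where it divides evenly. The log lines are abbreviated copies of the ones above, and the helper names are hypothetical.

```python
import re

CAPACITY_RE = re.compile(r"CapacityKB:(\d+)")


def capacity_kib(log_line: str) -> int:
    """Extract the CapacityKB value from a volumeOptions log line."""
    match = CAPACITY_RE.search(log_line)
    if match is None:
        raise ValueError("no CapacityKB field in log line")
    return int(match.group(1))


def describe(kib: int) -> str:
    """Render a KiB count in the largest unit that divides it evenly."""
    for unit, factor in (("GiB", 1024 * 1024), ("MiB", 1024)):
        if kib >= factor and kib % factor == 0:
            return f"{kib // factor} {unit}"
    return f"{kib} KiB"


# Abbreviated from the controller-manager logs quoted above.
failed = "volumeOptions: &{CapacityKB:1 Tags:map[...] DiskFormat:thin}"
succeeded = "volumeOptions: &{CapacityKB:1048576 Tags:map[...] DiskFormat:thin}"

print(describe(capacity_kib(failed)))     # 1 KiB
print(describe(capacity_kib(succeeded)))  # 1 GiB
```

The failing request asks vSphere for a 1 KiB disk, while the successful one asks for 1 GiB.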

Comment 2 Jan Safranek 2019-04-15 11:26:12 UTC
Upstream issue: https://github.com/kubernetes/kubernetes/issues/43319
Closed without a patch.

Comment 3 Eric Paris 2019-04-15 12:17:42 UTC
Can we make this bz about the error message? Find a way to state specifically which parameter was incorrect? (Or just round the 1b up to 1G, like other things supposedly do?)

Comment 4 Fabio Bertinatto 2019-04-15 15:15:10 UTC
Storage plugins treat a number-only storage size as bytes; however, they round it up to the nearest allocation unit (which may vary from provider to provider).

The exception here is vSphere, which simply converts the value to KiB. For the sake of consistency, I think we should round up in vSphere as well.
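
The effect of the allocation unit on a 1-byte request can be sketched as follows. This is only an illustration of rounding up to a unit, not code from any plugin; the helper name and the unit sizes chosen are assumptions.

```python
KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB


def round_up_to_unit(size_bytes: int, unit_bytes: int) -> int:
    """Return the smallest multiple of unit_bytes that is >= size_bytes."""
    units = -(-size_bytes // unit_bytes)  # ceiling division on integers
    return units * unit_bytes


request = 1  # bytes, as requested by the StatefulSet tests

# Resulting disk size in KiB for three illustrative allocation units.
for name, unit in (("KiB", KIB), ("MiB", MIB), ("GiB", GIB)):
    size_kib = round_up_to_unit(request, unit) // KIB
    print(f"allocation unit {name}: {size_kib} KiB")
```

With a KiB unit the request stays at the 1 KiB seen in the failing log line; a larger unit turns the same request into a disk of at least 1 MiB.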

Comment 9 Fabio Bertinatto 2019-04-24 12:33:20 UTC
PRs submitted and waiting for review:

Upstream: https://github.com/kubernetes/kubernetes/pull/76719
OpenShift: https://github.com/openshift/origin/pull/22580

Comment 10 Fabio Bertinatto 2019-04-25 07:10:47 UTC
Both PRs were merged.

Comment 12 Wenqi He 2019-05-05 06:32:35 UTC
Tested on the version below:

$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.1.0-0.nightly-2019-05-04-210601   True        False         3h59m   Cluster version is 4.1.0-0.nightly-2019-05-04-210601

Created a PVC without a unit:
$ cat pvc.json 
{
    "apiVersion": "v1",
    "kind": "PersistentVolumeClaim",
    "metadata": {
        "name": "pvc"
    },
    "spec": {
        "accessModes": [ "ReadWriteOnce" ],
        "resources": {
            "requests": {
                "storage": "1"
            }
        }
    }
}

$ oc get pvc
NAME   STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
pvc    Bound    pvc-ceb05d1a-6efe-11e9-b2ad-0050568bed6f   1Mi        RWO            thin           2m36s

The PVC was bound with its size rounded up to the nearest allocation unit, "Mi".

Comment 14 errata-xmlrpc 2019-06-04 10:47:37 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0758