Bug 1642302 - Cannot create vm disk with error "not enough space on the cluster" due to incorrect PV size provided by heketi/provisioner
Summary: Cannot create vm disk with error "not enough space on the cluster" due to incorrect PV size provided by heketi/provisioner
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Container Native Virtualization (CNV)
Classification: Red Hat
Component: Virtualization
Version: 1.3
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 1.3
Assignee: Marc Sluiter
QA Contact: Kedar Bidarkar
URL:
Whiteboard:
Depends On: 1649991
Blocks:
 
Reported: 2018-10-24 07:16 UTC by Guohua Ouyang
Modified: 2019-01-08 14:27 UTC
CC List: 14 users

Fixed In Version: kubevirt-0.9.6-5.gf09ec38.4badea0 virt-launcher-container-v1.3.0-16
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-01-08 14:27:01 UTC
Target Upstream Version:
Embargoed:


Attachments
Created PVC with CDI annotation (1.51 KB, text/plain)
2018-11-12 13:34 UTC, Kedar Bidarkar
Create VM with PVC-CDI (2.48 KB, text/plain)
2018-11-12 13:36 UTC, Kedar Bidarkar
oc describe vmi vmname shows syncfailed (2.91 KB, text/plain)
2018-11-12 13:37 UTC, Kedar Bidarkar
Created PV/PVC with local storage hdd storageclassName (847 bytes, text/plain)
2018-11-15 05:35 UTC, Kedar Bidarkar
Create VMI with PV/PVC local storage hdd storageclassName (1.05 KB, text/plain)
2018-11-15 05:36 UTC, Kedar Bidarkar


Links
Github heketi issue 1424 (closed): Should heketi take `storage.reserve` into account when creating a volume? (last updated 2020-10-26 15:28:24 UTC)

Description Guohua Ouyang 2018-10-24 07:16:10 UTC
Description of problem:
Create a VM and start it; it fails with the error below even though there is enough space on the node:

{"component":"virt-launcher","kind":"","level":"error","msg":"pre start setup for VirtualMachineInstance failed.","name":"rhel7","namespace":"default","pos":"manager.go:418","reason":"Unable to create /var/run/kubevirt-private/vmi-disks/disk0-pvc/disk.img with size 10Gi - not enough space on the cluster","timestamp":"2018-10-24T07:03:05.362669Z","uid":"b92a559c-d75a-11e8-a4cd-fa163e0953ea"}
{"component":"virt-launcher","kind":"","level":"error","msg":"Failed to sync vmi","name":"rhel7","namespace":"default","pos":"server.go:115","reason":"Unable to create /var/run/kubevirt-private/vmi-disks/disk0-pvc/disk.img with size 10Gi - not enough space on the cluster","timestamp":"2018-10-24T07:03:05.362730Z","uid":"b92a559c-d75a-11e8-a4cd-fa163e0953ea"}


$ oc get vmi rhel7 --template {{.status.nodeName}}
cnv-executor-gouyang-node2.example.com

[cloud-user@cnv-executor-gouyang-node2 ~]$ hostname
cnv-executor-gouyang-node2.example.com
[cloud-user@cnv-executor-gouyang-node2 ~]$ df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/vda1        40G  9.7G   31G  25% /
devtmpfs        3.8G     0  3.8G   0% /dev
tmpfs           3.9G     0  3.9G   0% /dev/shm
tmpfs           3.9G   12M  3.9G   1% /run
tmpfs           3.9G     0  3.9G   0% /sys/fs/cgroup
/dev/vdc5       6.4G   30M  6.0G   1% /mnt/local-storage/hdd/disk1
/dev/vdc6       6.4G   30M  6.0G   1% /mnt/local-storage/hdd/disk2
/dev/vdc7       6.6G   31M  6.2G   1% /mnt/local-storage/hdd/disk3
tmpfs           783M     0  783M   0% /run/user/1000
tmpfs           783M     0  783M   0% /run/user/0


Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1. create a pv
2. create a vm with a pvc using the pv
3. start the vm

Actual results:
Failed to start the VM. The VM's pod log has the error "Unable to create /var/run/kubevirt-private/vmi-disks/disk0-pvc/disk.img with size 10Gi - not enough space on the cluster".

Expected results:
The VM starts and is in the Running state.

Additional info:
    $ cat pv0001.yaml
    apiVersion: v1
    kind: PersistentVolume
    metadata:
      name: pv0001
    spec:
      capacity:
        storage: 20Gi
      accessModes:
      - ReadWriteOnce
      nfs:
        path: /opt/data1
        server: 10.66.137.157
      persistentVolumeReclaimPolicy: Recycle
     
    $ oc create -f pv0001.yaml
     
    $ cat vm-template-rhel7.yaml
    apiVersion: v1
    kind: Template
    metadata:
      annotations:
        description: OCP KubeVirt Red Hat Enterprise Linux 7.4 VM template
        iconClass: icon-rhel
        tags: kubevirt,ocp,template,linux,virtualmachine
      creationTimestamp: null
      labels:
        kubevirt.io/os: rhel-7.4
        miq.github.io/kubevirt-is-vm-template: "true"
      name: vm-template-rhel7
    objects:
    - apiVersion: kubevirt.io/v1alpha2
      kind: VirtualMachine
      metadata:
        creationTimestamp: null
        labels:
          kubevirt-vm: vm-${NAME}
          kubevirt.io/os: rhel-7.4
        name: ${NAME}
      spec:
        running: false
        template:
          metadata:
            creationTimestamp: null
            labels:
              kubevirt-vm: vm-${NAME}
              kubevirt.io/os: rhel-7.4
          spec:
            domain:
              cpu:
                cores: ${{CPU_CORES}}
              devices:
                disks:
                - disk:
                    bus: virtio
                  name: disk0
                  volumeName: disk0-pvc
              machine:
                type: ""
              resources:
                requests:
                  memory: ${MEMORY}
            terminationGracePeriodSeconds: 0
            volumes:
            - name: disk0-pvc
              persistentVolumeClaim:
                claimName: linux-vm-pvc-${NAME}
      status: {}
    - apiVersion: v1
      kind: PersistentVolumeClaim
      metadata:
        creationTimestamp: null
        name: linux-vm-pvc-${NAME}
      spec:
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 10Gi
        volumeName: pv0001
      status: {}
    parameters:
    - description: Name for the new VM
      name: NAME
    - description: Amount of memory
      name: MEMORY
      value: 4096Mi
    - description: Amount of cores
      name: CPU_CORES
      value: "4"
     
    $ oc process -f vm-template-rhel7.yaml -p NAME=rhel7 | oc create -f -
     
    $ virtctl start rhel7

Comment 1 Qixuan Wang 2018-10-24 07:22:02 UTC
Did you check glusterfs size? Such as
# oc rsh <heketi-storage-pod>
sh-4.2# heketi-cli --user='admin' --secret='<HEKETI_ADMIN_KEY>' node info

Comment 2 Guohua Ouyang 2018-10-24 08:02:41 UTC
sh-4.2#  heketi-cli --user='admin' --secret='47K+6RiJrFIB+Hf4nAuU8Xh5agOVw1Sn9ExD3ha2H1U=' node info bac578259e9fb72dec72deffd72469b7
Node Id: bac578259e9fb72dec72deffd72469b7
State: online
Cluster Id: 1de7b390135383f17bf910d75652736a
Zone: 1
Management Hostname: cnv-executor-gouyang-node2.example.com
Storage Hostname: 172.16.0.16
Devices:
Id:e8190f4d8839f3c385605826fbf09675   Name:/dev/vdb            State:online    Size (GiB):24      Used (GiB):13      Free (GiB):11      Bricks:3

Comment 3 Fabian Deutsch 2018-11-05 14:36:28 UTC
Moving to 1.4, as it's not nice, but it does not affect any primary flow.

Comment 6 Tomas Jelinek 2018-11-07 11:52:29 UTC
Since CDI will support blank disks only starting with 1.4, the UI expected an empty PVC to be initialized to an empty disk. This was needed so that a VM provisioned via PXE boot has an empty disk to install the system to.

With this bug around, the story is much more complicated: it is not enough to just create an empty PVC and attach it to the VM. Instead one has to create an empty disk, serve it somewhere and then use the "cdi.kubevirt.io/storage.import.endpoint: ..." annotation to initialize the PVC from it (see the sketch below).

Long story short, I think this bug should be fixed in 1.3 to simplify the PXE boot story. In 1.4 it is much less urgent, since CDI should support it and the UI will use that.

@Federico: what do you think?
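
[For illustration, a minimal sketch of the kind of PVC with the CDI import annotation mentioned above. The PVC name, endpoint URL and storage class are taken from later comments in this bug and are placeholders, not the exact attached spec.]

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: fedora-cdi
  annotations:
    # CDI import annotation mentioned above; the importer pod fills the PVC from this URL
    cdi.kubevirt.io/storage.import.endpoint: "https://download.fedoraproject.org/pub/fedora/linux/releases/28/Cloud/x86_64/images/Fedora-Cloud-Base-28-1.1.x86_64.qcow2"
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
  storageClassName: glusterfs-storage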

Comment 7 Fabian Deutsch 2018-11-07 13:45:11 UTC
Side note: We don't have a fix right now.

Comment 8 Marcin Franczyk 2018-11-07 13:54:27 UTC
This part is responsible for the free space calculation:
https://github.com/kubevirt/kubevirt/blob/master/pkg/host-disk/host-disk.go#L93
I will check it against gluster when I have some free time.

Comment 11 Fabian Deutsch 2018-11-12 13:24:39 UTC
Raising to urgent as it was mentioned this affects initialized PVCs as well.

Comment 13 Marcin Franczyk 2018-11-12 13:29:25 UTC
@gouyang could you please run the VMI again as you did before and then try to log into the virt-launcher container to check /var/run/kubevirt-private?

oc process -f vm-template-rhel7.yaml -p NAME=rhel7 | oc create -f -
virtctl start rhel7
oc get pods
(find virt-launcher container for rhel7)

oc exec -it <virt-launcher-container> -c compute bash
df -h /var/run/kubevirt-private

Comment 14 Kedar Bidarkar 2018-11-12 13:34:46 UTC
Created attachment 1504727 [details]
Created PVC with CDI annotation

PVC created successfully

Comment 15 Kedar Bidarkar 2018-11-12 13:36:03 UTC
Created attachment 1504730 [details]
Create VM with PVC-CDI

Created VM with PVC (which used the CDI annotation)

Comment 16 Kedar Bidarkar 2018-11-12 13:37:10 UTC
Created attachment 1504731 [details]
oc describe vmi vmname shows syncfailed

 Warning  SyncFailed          1h (x16 over 1h)  virt-handler, cnv-executor-kbidarka-node1.example.com  server error. command Launcher.Sync failed: Unable to create /var/run/kubevirt-private/vmi-disks/pvc-5648ede2-e45a-11e8-bfba-fa163e46059f/disk.img with size 4Gi - not enough space on the cluster
  Warning  Stopped             1h                virt-handler, cnv-executor-kbidarka-node1.example.com  The VirtualMachineInstance crashed.




Comment 17 Fabian Deutsch 2018-11-12 13:46:45 UTC
Kedar, what is the content of the PV?

Comment 20 Marcin Franczyk 2018-11-13 09:01:09 UTC
According to a discussion with Kedar, we require exactly the requested space to create a disk image, and the provisioner provides incorrectly sized PVs.

As an example output from df on virt-launcher (glusterfs mount point):

df -P /run/kubevirt-private/vmi-disks/pvc-5648ede2-e45a-11e8-bfba-fa163e46059f
Filesystem                                        Size  Used Avail Use%
172.16.0.17:vol_e8aecbd1cafacd6e41e7f64746cc11c2  4.0G   74M  4.0G   2% 

It shows 74M already used, which means the required 4Gi of space for a disk image is not there - only 3.926 Gi is available. So the message "there is not enough space..." is correct: the user requested a 4Gi image, but there is not enough space to create one.

The 74M of used space might be related to the space glusterfs reserves by default:
https://docs.gluster.org/en/v3/release-notes/3.13.0/#ability-to-reserve-back-end-storage-space

Comment 21 Roman Mohr 2018-11-13 09:12:50 UTC
Let me also add here that as a cloud user I would be very surprised if I asked for 4 GB of storage and got less.

In this case, if for some internal reason gluster needs some space left free, it could give the user 4074M and then "reserve" 74M again.
The same applies if e.g. a storage provider uses ext3/4, where some disk space may be reserved for root only. It is not helpful for an application if e.g. 5 or 10% of the space is then reserved for root.

Just to keep in mind that this is the base-assumption about storage providers.

Comment 22 Fabian Deutsch 2018-11-13 09:29:37 UTC
Thanks for the quick investigation.

Note from Kubernetes docs:

"Otherwise, the user will always get at least what they asked for, but the volume may be in excess of what was requested."

https://kubernetes.io/docs/concepts/storage/persistent-volumes/#binding

So yes, this is then actually not a bug on our side, but a bug in the provisioner.

CDI populates the storage, but it seems it was provisioned by gluster. Thus I consider this to be a bug in the gluster provisioner until we have additional information.

Niels, thoughts?

Comment 23 Marcin Franczyk 2018-11-13 09:33:52 UTC
I created a PR which improves the error message; it now shows the requested and the available size.

https://github.com/kubevirt/kubevirt/pull/1693

Comment 24 Niels de Vos 2018-11-13 11:21:26 UTC
(In reply to Marcin Franczyk from comment #20)
> According to a discussion with Kedar, we require exact space to create a
> disk image. Provider provides incorrectly sized PVs.
> 
> As an example output from df on virt-launcher (glusterfs mount point):
> 
> df -P
> /run/kubevirt-private/vmi-disks/pvc-5648ede2-e45a-11e8-bfba-fa163e46059f
> Filesystem                                        Size  Used Avail Use%
> 172.16.0.17:vol_e8aecbd1cafacd6e41e7f64746cc11c2  4.0G   74M  4.0G   2% 
> 
> it shows already used 74M which means there is no required 4Gi space for a
> disk image there is only 3.926 Gi, so the message "there is not enough
> space..." is correct, user required a 4Gi image but there is no space to
> create one.
> 
> The 74M used space might be related to glusterfs reserved space by default
> https://docs.gluster.org/en/v3/release-notes/3.13.0/#ability-to-reserve-back-
> end-storage-space

This is a good point. Possibly the `storage.reserve` option was not available in earlier versions (are you using RHGS or upstream Gluster?).

The option can be disabled in the StorageClass. See the example at https://github.com/kubernetes-incubator/external-storage/blob/master/gluster/file/examples/class.yaml for the `volumeoptions` parameter. Note that this is an external provisioner, and you are most likely using the kubernetes embedded one.

In any case, you can set the `volumeoptions: "storage.reserve 0"` to disable the additional allocation. This is safe for the non-distribute volumes that are normally used in combination with kubevirt.

Can you let me know if that helps? Otherwise we will need to account for more space when we allocate the Gluster volume through heketi.
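
[For reference, a minimal sketch of a StorageClass carrying that option, assuming the in-tree kubernetes.io/glusterfs provisioner; the resturl, secret and name values below are placeholders rather than the actual heketi endpoint and credentials of this environment.]

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: glusterfs-storage-noreserve
provisioner: kubernetes.io/glusterfs
parameters:
  resturl: "http://heketi-storage.glusterfs.svc:8080"   # placeholder heketi endpoint
  restuser: "admin"
  secretNamespace: "glusterfs"                          # placeholder secret location
  secretName: "heketi-storage-admin-secret"
  # disable the gluster-side reservation discussed above
  volumeoptions: "storage.reserve 0"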

Comment 25 Humble Chirammal 2018-11-13 11:22:32 UTC
(In reply to Marcin Franczyk from comment #20)
> According to a discussion with Kedar, we require exact space to create a
> disk image. Provider provides incorrectly sized PVs.
> 
> As an example output from df on virt-launcher (glusterfs mount point):
> 
> df -P
> /run/kubevirt-private/vmi-disks/pvc-5648ede2-e45a-11e8-bfba-fa163e46059f
> Filesystem                                        Size  Used Avail Use%
> 172.16.0.17:vol_e8aecbd1cafacd6e41e7f64746cc11c2  4.0G   74M  4.0G   2% 
> 
> it shows already used 74M which means there is no required 4Gi space for a
> disk image there is only 3.926 Gi, so the message "there is not enough
> space..." is correct, user required a 4Gi image but there is no space to
> create one.
> 
> The 74M used space might be related to glusterfs reserved space by default
> https://docs.gluster.org/en/v3/release-notes/3.13.0/#ability-to-reserve-back-
> end-storage-space

I agree that we could reserve some extra space in the backend to provide 'completely usable space' == 'requested space'. The provisioner passes the requested space as-is to heketi; from there on it is in heketi's control.

Niels, please feel free to open an issue at heketi.

As an additional thought, a reservation at the LVM/FS layer is kind of expected in the general scenario, so it can be accounted for by the admin/user of the app pod as well - but how much is still a question. Even if the gluster reservation is taken care of by heketi, we won't be in a better position to clearly say what the LVM or FS reservation is.

Comment 26 Niels de Vos 2018-11-13 11:33:35 UTC
IIRC Heketi already takes additional required space into account, at least for LVM-metadata and possibly XFS. The relatively new `storage.reserve` might not be included in the overhead yet.

Comment 27 Marcin Franczyk 2018-11-13 11:53:11 UTC
(In reply to Niels de Vos from comment #24)
> (In reply to Marcin Franczyk from comment #20)
> > According to a discussion with Kedar, we require exact space to create a
> > disk image. Provider provides incorrectly sized PVs.
> > 
> > As an example output from df on virt-launcher (glusterfs mount point):
> > 
> > df -P
> > /run/kubevirt-private/vmi-disks/pvc-5648ede2-e45a-11e8-bfba-fa163e46059f
> > Filesystem                                        Size  Used Avail Use%
> > 172.16.0.17:vol_e8aecbd1cafacd6e41e7f64746cc11c2  4.0G   74M  4.0G   2% 
> > 
> > it shows already used 74M which means there is no required 4Gi space for a
> > disk image there is only 3.926 Gi, so the message "there is not enough
> > space..." is correct, user required a 4Gi image but there is no space to
> > create one.
> > 
> > The 74M used space might be related to glusterfs reserved space by default
> > https://docs.gluster.org/en/v3/release-notes/3.13.0/#ability-to-reserve-back-
> > end-storage-space
> 
> This is a good point. Possibly the `storage.reserve` option was not
> available in earlier versions (are you using RHGS or upstream Gluster?).
> 
> The option can be disabled in the StorageClass. See the example at
> https://github.com/kubernetes-incubator/external-storage/blob/master/gluster/
> file/examples/class.yaml for the `volumeoptions` parameter. Note that this
> is an external provisioner, and you are most likely using the kubernetes
> embedded one.
> 
> In any case, you can set the `volumeoptions: "storage.reserve 0"` to disable
> the additional allocation. This is safe for the non-distribute volumes that
> are normally used in combination with kubevirt.
> 
> Can you let me know if that helps? Otherwise we will need to account for
> more space when we allocate the Gluster volume through heketi.

That's a question for Guohua Ouyang and Kedar Bidarkar; I am not aware of their gluster config and version. Guys, can you check that?

Comment 28 Fabian Deutsch 2018-11-13 13:05:20 UTC
(In reply to Humble Chirammal from comment #25)
…

> 
> As an additional thought, the reservation from LVM/FS layer is kind of
> expected in general scenario. So it can be accounted by the admin/user of
> app pod as well. but, how much is still a question. Even if gluster
> reservation is taken care by heketi, we wont be in a better position to
> clearly say what LVM or FS reservation is.

I understand this in the technical context, but from the user's POV (and that is what Kube is concerned about) what you get should be _at least_ what you requested.
The technical overhead is indeed something to account for, but in that case it will always be the storage provider's concern to adjust the storage size as needed.

Again, the PVC request guarantees that a user will get the requested size or larger - that is what counts.

Comment 29 Fabian Deutsch 2018-11-14 09:32:12 UTC
Kedar, can you please open a new bug on CNS (PV size is not at least the size of the PVC request)?

Let's use this bug to track this issue in the release notes.

Comment 30 Fabian Deutsch 2018-11-14 09:43:02 UTC
Kedar, please also report whether this also happens with local storage (https://kubernetes.io/docs/concepts/storage/volumes/#local).
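
[For reference, a minimal sketch of the kind of local-storage PV/PVC used for this test, not the actual attached specs; the storage class name and mount path follow the attachment titles and df output above, while the node hostname, object names and size are illustrative assumptions.]

apiVersion: v1
kind: PersistentVolume
metadata:
  name: local-pv-disk1
spec:
  capacity:
    storage: 4Gi
  accessModes:
  - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: hdd
  local:
    # one of the pre-mounted partitions shown in the df output above
    path: /mnt/local-storage/hdd/disk1
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values:
          - cnv-executor-kbidarka-node1.example.com   # assumed node
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: local-hdd-pvc
spec:
  accessModes:
  - ReadWriteOnce
  storageClassName: hdd
  resources:
    requests:
      storage: 4Gi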

Comment 31 Kedar Bidarkar 2018-11-15 05:12:46 UTC
Filed a bug on Gluster Storage, Component: Heketi and OCS-3.11 here,

https://bugzilla.redhat.com/show_bug.cgi?id=1649991

Comment 32 Kedar Bidarkar 2018-11-15 05:25:46 UTC
Tried creating PV/PVC using local storage and then created a VM which uses the PV/PVC.

[cloud-user@cnv-executor-kbidarka-node1 disk1]$ ll -h 
total 20K
-rw-r--r--. 1  107  107 4.0G Nov 14 12:01 disk.img
drwxrwxrwx. 2 root root  16K Nov  9 10:49 lost+found

[root@vmi-fedora28-cloud-nocdi ~]# fdisk -l /dev/vdb
Disk /dev/vdb: 4 GiB, 4294967296 bytes, 8388608 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes

Will be attaching the spec files shortly.

Comment 33 Kedar Bidarkar 2018-11-15 05:35:26 UTC
Created attachment 1505933 [details]
Created PV/PVC with local storage hdd storageclassName

Comment 34 Kedar Bidarkar 2018-11-15 05:36:52 UTC
Created attachment 1505934 [details]
Create VMI with PV/PVC local storage hdd storageclassName

Comment 35 Fabian Deutsch 2018-11-15 09:13:07 UTC
To me this looks as if this is NOTABUG?

Comment 36 Fabian Deutsch 2018-11-15 09:51:10 UTC
Are you saying in comment 32 that disk.img is empty?

Comment 37 Marcin Franczyk 2018-11-15 09:55:21 UTC
Fabian, yes, the disk image is empty; that is correct behavior. If the PVC doesn't contain a disk image, then a sparse file with the required size is created.
In my opinion, this BZ is not a bug.

Comment 38 Kedar Bidarkar 2018-11-15 10:18:27 UTC
In comment 32, what I actually meant was:

1) Created a PV/PVC with 4 GiB size (4294967296 bytes)
2) Then created a VM with the above PV/PVC, and /dev/vdb shows the size 4294967296 bytes


1) Size of disk.img from the node1 on the OCP cluster

[cloud-user@cnv-executor-kbidarka-node1 disk1]$ ll
total 20
-rw-r--r--. 1  107  107 4294967296 Nov 14 12:01 disk.img


2) Size of /dev/vdb from inside the VM

 [root@vmi-fedora28-cloud-nocdi ~]# fdisk -l /dev/vdb
Disk /dev/vdb: 4 GiB, 4294967296 bytes, 8388608 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes

Comment 39 Roman Mohr 2018-11-15 10:20:32 UTC
It is a sparse image file, so it being small looks reasonable. Is that what you wanted to highlight?


Comment 42 Niels de Vos 2018-11-15 11:00:38 UTC
Kedar, could you try whether the StorageClass option mentioned in comment #24 makes a difference?

Thanks!

Comment 43 Fabian Deutsch 2018-11-15 12:52:21 UTC
Moving this bug to the storage component to get it validated once the parent bug is resolved. Also retargeting it, as it's only for validation.

Changing priority, as this bug is not critical, but the parent one is.

Comment 44 Kedar Bidarkar 2018-11-15 13:24:22 UTC
Ok, below are my findings

1) with gluster volume reserved space set to 0, we see 33MB already being consumed

2) with gluster volume reserved space not set to 0, we see 74MB already being consumed


1) with gluster volume reserved space set to 0, we see 33MB already being consumed

a) set the volume reserved space to 0
sh-4.2# gluster volume set vol_caa9a5c2fb4a634370dc90db6adeb4d9 storage.reserve 0 
volume set: success
sh-4.2# gluster volume list                                  
heketidbstorage
vol_caa9a5c2fb4a634370dc90db6adeb4d9

b) sh-4.2# gluster volume info vol_caa9a5c2fb4a634370dc90db6adeb4d9
 
Volume Name: vol_caa9a5c2fb4a634370dc90db6adeb4d9
Type: Replicate
Volume ID: 887dc46c-f3be-467c-a583-de333a06adf4
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: 172.16.0.17:/var/lib/heketi/mounts/vg_2e3f670fb4ddb9df1cf20ff7a111160e/brick_58a4107ea988d6286d7632877d8634e1/brick
Brick2: 172.16.0.25:/var/lib/heketi/mounts/vg_2bad4ab5f6211fbafd04856aa190dd45/brick_4e749b1fdce123fef5d01ad7593ff71f/brick
Brick3: 172.16.0.22:/var/lib/heketi/mounts/vg_5f38518705232c698a91a48169d32bbd/brick_615529e89cfc7576f348ce0f5c94060e/brick
Options Reconfigured:
storage.reserve: 0
transport.address-family: inet
nfs.disable: on
performance.client-io-threads: off
cluster.brick-multiplex: on

c)

kbidarka  ~  manifests_local  oc rsh -c compute virt-launcher-vmi-fedora28-cloud-2lntr


sh-4.2# df -h /var/run/kubevirt-private/vmi-disks/pvc-2cebc7cd-e8c9-11e8-89d8-fa163e46059f/
Filesystem                                        Size  Used Avail Use% Mounted on
172.16.0.17:vol_caa9a5c2fb4a634370dc90db6adeb4d9  4.0G   33M  4.0G   1% /run/kubevirt-private/vmi-disks/pvc-2cebc7cd-e8c9-11e8-89d8-fa163e46059f



 kbidarka  ~  manifests_local  oc describe pv pvc-2cebc7cd-e8c9-11e8-89d8-fa163e46059f | grep -i Path 
    Path:           vol_caa9a5c2fb4a634370dc90db6adeb4d9



2) with gluster volume reserved space not set to 0, we see 74MB already being consumed

  kbidarka  ~  manifests_local  oc rsh -c compute virt-launcher-vmi-fedora28-cloud1-29ph5
sh-4.2# df -h /var/run/kubevirt-private/vmi-disks/pvc-e1efecbc-e8cf-11e8-89d8-fa163e46059f/
Filesystem                                        Size  Used Avail Use% Mounted on
172.16.0.17:vol_975d2eb941dff39ea9d68557c645f874  4.0G   74M  4.0G   2% /run/kubevirt-private/vmi-disks/pvc-e1efecbc-e8cf-11e8-89d8-fa163e46059f


 kbidarka  ~  manifests_local  oc describe pv pvc-e1efecbc-e8cf-11e8-89d8-fa163e46059f | grep -i Path 
    Path:           vol_975d2eb941dff39ea9d68557c645f874


Gluster volume for which reserved space is not set to zero.

sh-4.2# gluster volume info vol_975d2eb941dff39ea9d68557c645f874
 
Volume Name: vol_975d2eb941dff39ea9d68557c645f874
Type: Replicate
Volume ID: 1674fbce-2247-4798-9653-4fa5264b48c3
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: 172.16.0.22:/var/lib/heketi/mounts/vg_5f38518705232c698a91a48169d32bbd/brick_330fb4df457f93e815f33b690cbde394/brick
Brick2: 172.16.0.17:/var/lib/heketi/mounts/vg_2e3f670fb4ddb9df1cf20ff7a111160e/brick_ac62f9ab768c4ff31cfb795eb2c2484c/brick
Brick3: 172.16.0.25:/var/lib/heketi/mounts/vg_2bad4ab5f6211fbafd04856aa190dd45/brick_7e7a40738b255b64c50e3776fdbbb770/brick
Options Reconfigured:
transport.address-family: inet
nfs.disable: on
performance.client-io-threads: off
cluster.brick-multiplex: on

Comment 45 Niels de Vos 2018-11-15 13:47:10 UTC
Thanks for the additional details!

Comment 46 Fabian Deutsch 2018-11-15 15:01:54 UTC
Notes:

1. Workaround: Create an empty gluster PV, manually create the disk.img
2. Drop the empty-pv-initialization feature from release-0.9

Comment 47 Fabian Deutsch 2018-11-15 15:13:04 UTC
Side note: comment 46 is a short-term fix we can deliver fast.

Long term fixes can be:
1. fix heketi to include overhead
2. fix provisioner to include overhead


Something to clarify: what is the expectation when you request a PVC of 4G? Does this mean I can store 4G of data on it, or that I get a 4G volume (which might be smaller due to overhead)?

My take: we need to provide a volume which allows storing the requested amount of data.

Comment 48 Stephen Gordon 2018-11-15 15:18:04 UTC
(In reply to Fabian Deutsch from comment #46)
> Notes:
> 
> 1. Workaround: Create an empty gluster PV, manually create the disk.img
> 2. Drop the empty-pv-initialization feature from release-0.9

I lean towards (1) given this appears to be specific to the choices made by the Gluster provisioner.

Comment 49 Fabian Deutsch 2018-11-19 13:23:05 UTC
Update: The decision is to tolerate this issue on the KubeVirt side.

1. There is a kubevirt-config option, pvc-tolerate-less-space-up-to-percent (default 10), which lets a user configure how much less space than requested a PVC may provide and still be accepted
2. The disk will be created with a size up to the available disk capacity
3. An event will be raised whenever requested capacity != available capacity
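
[For illustration, a minimal sketch of how this option could be set, assuming it lives in the kubevirt-config ConfigMap and that KubeVirt is deployed in the kubevirt namespace; adjust the namespace and percentage to the actual deployment.]

apiVersion: v1
kind: ConfigMap
metadata:
  name: kubevirt-config
  namespace: kubevirt   # assumed install namespace
data:
  # accept PVs that are up to this many percent smaller than the PVC request
  pvc-tolerate-less-space-up-to-percent: "10"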

Comment 54 Kedar Bidarkar 2018-11-23 19:35:25 UTC
Below are my findings 

1) Created a PVC successfully and then created a VM which uses this PVC
2) The virt-launcher pod of the VMI shows a lot of errors; will attach logs shortly.

Also, I tried creating 3 PVCs in total, but only 1 PVC creation was successful; the other 2 PVC creations failed.

My setup has the below-mentioned components, as seen from the ImageIDs.

virt-api-container-v1.3.0-14 virt-controller-container-v1.3.0-16 virt-handler-container-v1.3.0-15 virt-launcher-container-v1.3.0-16

Comment 58 Kedar Bidarkar 2018-11-23 19:55:47 UTC
Comment 57 talks about a different PVC/importer pod, which was started a few minutes after the first PVC creation.

With the logs and screenshot attached in comment 57, I want to mention that something weird happened and the PVC creation/importer pod crashed midway.

The same happened with another PVC creation/importer pod.

Not sure if all this is related somehow.

----

Summary: In total, 3 PVC creations were started one after another; the 1st PVC succeeded, the later 2 failed with the same error as shown in the screenshot in comment 57.

Comment 59 Kedar Bidarkar 2018-11-24 10:43:39 UTC
I see the issue on my setup; the heketi-storage pod STATUS is "CreateContainerError".


[cloud-user@cnv-executor-kbidarka-master1 ~]$ oc project glusterfs 
Now using project "glusterfs" on server "https://cnv-executor-kbidarka-master1.example.com:8443".
[cloud-user@cnv-executor-kbidarka-master1 ~]$ oc get pods 
NAME                                          READY     STATUS                 RESTARTS   AGE
glusterblock-storage-provisioner-dc-1-zx6hf   1/1       Running                0          23h
glusterfs-storage-hxp7b                       1/1       Running                0          23h
glusterfs-storage-rjp5x                       1/1       Running                0          23h
glusterfs-storage-vzmj8                       1/1       Running                0          23h
heketi-storage-1-r9bfr                        0/1       CreateContainerError   0          23h

Comment 61 Denys Shchedrivyi 2018-11-25 17:08:06 UTC
I created a PVC and VMI with Kedar's yaml files (with a small difference: for the PVC I increased the disk size to 5 GB; for the VMI I increased the CPU cores to 2 and the memory to 4 GB), and it works well. I don't see the error messages that Kedar has.

 The steps I used:
1) oc create -f pvc_fedora.yml
# oc get pvc
NAME         STATUS    VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS        AGE
fedora-cdi   Bound     pvc-aab8349e-f04a-11e8-ac68-fa163e471330   5Gi        RWO            glusterfs-storage   1h

2) waited until the PVC was created and the image was downloaded (it took more than 1.5 hours for me):
# oc logs importer-fedora-cdi-bfxg8
I1124 17:23:33.618601       1 importer.go:32] Starting importer
I1124 17:23:33.619043       1 importer.go:37] begin import process
I1124 17:23:33.619065       1 dataStream.go:222] copying "https://download.fedoraproject.org/pub/fedora/linux/releases/28/Cloud/x86_64/images/Fedora-Cloud-Base-28-1.1.x86_64.qcow2" to "/data/disk.img"...
...
I1124 17:23:34.262658       1 prlimit.go:98]     (0.00/100%)
...
I1124 18:52:02.665703       1 prlimit.go:98]     (98.67/100%)


3) After the image was downloaded, the pod importer-fedora-cdi-bfxg8 disappeared and I created the VMI:
oc create -f vmi-fedora.yml

4) waited a bit for the VM to start and successfully connected to the console.

Just one thing is not clear to me - why does the disk in Fedora have a size of 4 GiB when I created the PVC with 5 Gi:

$ virtctl console vmi-fedora28-cdi
Successfully connected to vmi-fedora28-cdi console. The escape sequence is ^]

[root@vmi-fedora28-cdi ~]# fdisk -l 
Disk /dev/vda: 4 GiB, 4294967296 bytes, 8388608 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x770fc65d

Device     Boot Start     End Sectors Size Id Type
/dev/vda1  *     2048 8388607 8386560   4G 83 Linux

Comment 62 Roman Mohr 2018-11-26 07:52:36 UTC
I just want to highlight that this report is about adding an empty PVC directly to a VMI, without CDI. KubeVirt should create a sparse, empty disk on that PVC with (now) roughly the size of the PVC. Maybe it is clear to everyone, but there is a lot of talk going on about CDI-related things.
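
[To make that scenario concrete, a minimal sketch of a VMI that references an empty PVC directly, with no CDI annotation involved; the names and sizes are illustrative, and the structure mirrors the v1alpha2 template in the bug description.]

apiVersion: kubevirt.io/v1alpha2
kind: VirtualMachineInstance
metadata:
  name: vmi-empty-pvc
spec:
  domain:
    devices:
      disks:
      - disk:
          bus: virtio
        name: disk0
        volumeName: disk0-pvc
    resources:
      requests:
        memory: 1Gi
  terminationGracePeriodSeconds: 0
  volumes:
  - name: disk0-pvc
    persistentVolumeClaim:
      claimName: empty-disk-pvc   # an empty PVC; virt-launcher creates a sparse disk.img on it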

Comment 63 Fabian Deutsch 2018-11-26 08:07:09 UTC
The bug as described in comment 49 is now fixed (see comment 61).

Comment 65 Kedar Bidarkar 2018-11-26 09:51:16 UTC
Agreed, this is not related to CDI; my spec/manifest files just happened to use CDI.
But using CDI for some reason renders the entire CNV setup useless, which may be another bug/issue altogether.

I just tested without CDI and this appears to be fixed and in sync with what Denys sees.

OC describe VMI VMI-NAME
-------------------------

Status:
  Interfaces:
    Ip Address:  <IP address>
  Node Name:     cnv-executor-kbidarka-node2.example.com
  Phase:         Running
Events:
  Type    Reason              Age                From                                                   Message
  ----    ------              ----               ----                                                   -------
  Normal  SuccessfulCreate    1m                 virtualmachine-controller                              Created virtual machine pod virt-launcher-vmi-fedora-cs252
  Normal  SuccessfulHandOver  22s                virtualmachine-controller                              Pod ownership transferred to the node cnv-executor-kbidarka-node2.example.com
  Normal  ToleratedSmallPV    22s                virt-handler, cnv-executor-kbidarka-node2.example.com  PV size too small: expected 10737418240 B, found 10584842240 B. Using it anyway, it is within 10 % toleration
  Normal  Created             21s (x2 over 21s)  virt-handler, cnv-executor-kbidarka-node2.example.com  VirtualMachineInstance defined.
  Normal  Started             21s                virt-handler, cnv-executor-kbidarka-node2.example.com  VirtualMachineInstance started.

Tested with latest CNV-1.3

Comment 66 Marc Sluiter 2018-11-26 09:59:21 UTC
This is exactly what we expect to see now: KubeVirt detects a PV which is smaller than expected, but it no longer fails; it only reports it:

Normal  ToleratedSmallPV    22s                virt-handler, cnv-executor-kbidarka-node2.example.com  PV size too small: expected 10737418240 B, found 10584842240 B. Using it anyway, it is within 10 % toleration

Comment 67 Roman Mohr 2018-11-26 16:49:28 UTC
Arguably it might be better to only allocate the space which is there then for the disk and not pause ...

Comment 68 Roman Mohr 2018-11-26 17:08:12 UTC
> Arguably it might be better to only allocate the space which is there then for the disk and not pause ...

and that is what we do anyway, so it should not pause. Forget my last comment.

Comment 69 Nelly Credi 2018-11-26 17:56:35 UTC
@Fabian I believe you mentioned it may happen. Is that still a possibility?

Comment 70 Fabian Deutsch 2018-11-26 19:55:15 UTC
The case I probably mentioned was to create a sparse file as large as requested, which could only be filled up to the available capacity. In that case the disk (from the VM's perspective) could not be fully filled up, because it is larger than the available capacity.
In such a case the VM would have been paused once no further writes could be performed on the underlying disk (once the capacity was exhausted).

But the now-chosen implementation does not suffer from this behavior.

Comment 71 Nelly Credi 2018-11-27 12:22:13 UTC
Cool, removed.

Comment 72 Nelly Credi 2018-12-02 12:04:23 UTC
Was able to run E2E flow - 
create a VM on gluster storage class,
start it & access the console

on an env with FIPS disabled, 
OCS 3.11.1

