Bug 1625240 - Move disk (cold or live) fails with "InvalidParameterException: Invalid parameter: 'initial size="
Keywords:
Status: CLOSED WORKSFORME
Alias: None
Product: vdsm
Classification: oVirt
Component: Core
Version: 4.20.31
Hardware: x86_64
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ovirt-4.3.5
Assignee: Eyal Shenitzky
QA Contact: Avihai
URL:
Whiteboard:
Duplicates: 1625669
Depends On:
Blocks:
 
Reported: 2018-09-04 13:08 UTC by Elad
Modified: 2019-05-16 13:23 UTC (History)
CC List: 7 users

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2019-05-16 12:31:47 UTC
oVirt Team: Storage
Embargoed:


Attachments (Terms of Use)
engine.log (37.84 KB, text/plain)
2018-10-10 08:28 UTC, Simon
vdsm.log (33.33 KB, text/plain)
2018-10-10 09:12 UTC, Simon

Description Elad 2018-09-04 13:08:45 UTC
Description of problem:
Moving a specific disk (cold or live) fails for me every time with InvalidParameterException.


Version-Release number of selected component (if applicable):

[root@vega13 ~]# rpm -qa |grep vdsm
vdsm-api-4.20.39-1.el7ev.noarch
vdsm-client-4.20.39-1.el7ev.noarch
vdsm-jsonrpc-4.20.39-1.el7ev.noarch
vdsm-yajsonrpc-4.20.39-1.el7ev.noarch
vdsm-python-4.20.39-1.el7ev.noarch
vdsm-4.20.39-1.el7ev.x86_64
vdsm-common-4.20.39-1.el7ev.noarch
vdsm-network-4.20.39-1.el7ev.x86_64
vdsm-hook-vmfex-dev-4.20.39-1.el7ev.noarch
vdsm-http-4.20.39-1.el7ev.noarch
[root@vega13 ~]# rpm -qa |grep ovirt
ovirt-imageio-common-1.4.4-0.el7ev.x86_64
ovirt-vmconsole-host-1.0.5-4.el7ev.noarch
ovirt-vmconsole-1.0.5-4.el7ev.noarch
ovirt-imageio-daemon-1.4.4-0.el7ev.noarch
[root@vega13 ~]# rpm -qa |grep libvirt
libvirt-daemon-driver-storage-gluster-3.9.0-14.el7_5.7.x86_64
libvirt-daemon-driver-storage-3.9.0-14.el7_5.7.x86_64
libvirt-daemon-driver-nwfilter-3.9.0-14.el7_5.7.x86_64
libvirt-daemon-driver-storage-logical-3.9.0-14.el7_5.7.x86_64
libvirt-lock-sanlock-3.9.0-14.el7_5.7.x86_64
libvirt-daemon-driver-network-3.9.0-14.el7_5.7.x86_64
libvirt-client-3.9.0-14.el7_5.7.x86_64
libvirt-daemon-driver-storage-core-3.9.0-14.el7_5.7.x86_64
libvirt-daemon-driver-storage-scsi-3.9.0-14.el7_5.7.x86_64
libvirt-daemon-driver-storage-iscsi-3.9.0-14.el7_5.7.x86_64
libvirt-daemon-config-nwfilter-3.9.0-14.el7_5.7.x86_64
libvirt-daemon-driver-nodedev-3.9.0-14.el7_5.7.x86_64
libvirt-daemon-3.9.0-14.el7_5.7.x86_64
libvirt-python-3.9.0-1.el7.x86_64
libvirt-daemon-driver-qemu-3.9.0-14.el7_5.7.x86_64
libvirt-daemon-driver-storage-disk-3.9.0-14.el7_5.7.x86_64
libvirt-daemon-driver-secret-3.9.0-14.el7_5.7.x86_64
libvirt-libs-3.9.0-14.el7_5.7.x86_64
libvirt-daemon-driver-storage-rbd-3.9.0-14.el7_5.7.x86_64
libvirt-daemon-driver-interface-3.9.0-14.el7_5.7.x86_64
libvirt-daemon-driver-storage-mpath-3.9.0-14.el7_5.7.x86_64
libvirt-daemon-kvm-3.9.0-14.el7_5.7.x86_64
[root@vega13 ~]# rpm -qa |grep sanlock
libvirt-lock-sanlock-3.9.0-14.el7_5.7.x86_64
sanlock-3.6.0-1.el7.x86_64
sanlock-python-3.6.0-1.el7.x86_64
sanlock-lib-3.6.0-1.el7.x86_64
[root@vega13 ~]# rpm -qa |grep qemu
ipxe-roms-qemu-20170123-1.git4e85b27.el7_4.1.noarch
qemu-kvm-rhev-2.10.0-21.el7_5.7.x86_64
qemu-img-rhev-2.10.0-21.el7_5.7.x86_64
libvirt-daemon-driver-qemu-3.9.0-14.el7_5.7.x86_64
qemu-kvm-common-rhev-2.10.0-21.el7_5.7.x86_64
qemu-kvm-tools-rhev-2.10.0-21.el7_5.7.x86_64
[root@vega13 ~]# rpm -qa |grep selinux
libselinux-ruby-2.5-12.el7.x86_64
libselinux-2.5-12.el7.x86_64
libselinux-utils-2.5-12.el7.x86_64
selinux-policy-targeted-3.13.1-192.el7_5.6.noarch
selinux-policy-3.13.1-192.el7_5.6.noarch
libselinux-python-2.5-12.el7.x86_64
libselinux-2.5-12.el7.i686
[root@vega13 ~]# uname -a
Linux vega13.qa.lab.tlv.redhat.com 3.10.0-862.11.6.el7.x86_64 #1 SMP Fri Aug 10 16:55:11 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
[root@vega13 ~]# cat /etc/os-release
NAME="Red Hat Enterprise Linux Server"
VERSION="7.5 (Maipo)"

rhvm-4.2.6.4-0.1.el7ev.noarch


How reproducible:
Every move attempt for a specific disk

Steps to Reproduce:
1. Move disk (live or cold)


Actual results:

2018-09-04 15:10:51,050+0300 INFO  (tasks/6) [storage.Volume] Creating volume a2ba792b-23ff-4373-84b7-85fd5a1af047 (volume:1185)
2018-09-04 15:10:51,101+0300 ERROR (tasks/6) [storage.Volume] The requested initial 78090314752 is bigger than the max size 0 (blockVolume:346)
2018-09-04 15:10:51,101+0300 ERROR (tasks/6) [storage.Volume] Failed to create volume /rhev/data-center/mnt/blockSD/037837c4-44ea-4780-89ca-e6c7561d758d/images/b20deb5a-b087-4281-8891-2b3252cb96f4/a2ba792b-23ff-4373-84b7-85fd5a1af047: Invalid parameter: 'initial size=152520146' (volume:1209)
2018-09-04 15:10:51,101+0300 ERROR (tasks/6) [storage.Volume] Unexpected error (volume:1249)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/storage/volume.py", line 1206, in create
    initialSize=initialSize)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/blockVolume.py", line 502, in _create
    size, initialSize)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/blockVolume.py", line 548, in calculate_volume_alloc_size
    preallocate, capacity, initial_size)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/blockVolume.py", line 348, in calculate_volume_alloc_size
    initial_size)
InvalidParameterException: Invalid parameter: 'initial size=152520146'
2018-09-04 15:10:51,101+0300 ERROR (tasks/6) [storage.TaskManager.Task] (Task='962f5b87-c375-4fc8-8632-85b63b2aa3cc') Unexpected error (task:875)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 882, in _run
    return fn(*args, **kargs)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 336, in run
    return self.cmd(*self.argslist, **self.argsdict)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/securable.py", line 79, in wrapper
    return method(self, *args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/sp.py", line 1943, in createVolume
    initialSize=initialSize)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/sd.py", line 799, in createVolume
    initialSize=initialSize)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/volume.py", line 1251, in create
    (volUUID, e))
VolumeCreationError: Error creating a new volume: (u"Volume creation a2ba792b-23ff-4373-84b7-85fd5a1af047 failed: Invalid parameter: 'initial size=152520146'",)
2018-09-04 15:10:51,163+0300 INFO  (tasks/6) [storage.Volume] startCreateVolumeRollback: sdUUID=037837c4-44ea-4780-89ca-e6c7561d758d imgUUID=b20deb5a-b087-4281-8891-2b3252cb96f4 volUUID=a2ba792b-23ff-4373-84b7-85fd5a1af047  (volume:1094)
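For context, the check that fires in blockVolume.py compares the requested initial allocation against a maximum derived from the volume's capacity; with the capacity recorded as 0, every initial size is rejected. A minimal sketch of that logic (illustrative only: the function name mirrors the traceback, but the body, constants, and exception class here are assumptions, not the actual vdsm code):

```python
# Simplified sketch of the block-volume allocation check that raises here.
# NOT the actual vdsm code; constants and conversions are illustrative.

QCOW_OVERHEAD_FACTOR = 1.1   # assumed qcow2 metadata overhead allowance
SECTOR_SIZE = 512            # bytes per sector


class InvalidParameterException(Exception):
    def __init__(self, name, value):
        super().__init__(
            "Invalid parameter: %r" % ("%s=%s" % (name, value)))


def calculate_volume_alloc_size(capacity_sectors, initial_size_sectors):
    """Reject an initial size larger than the volume's maximum size.

    With capacity 0 (the broken disk in this report), max_size is 0,
    so any requested initial size is rejected.
    """
    max_size = int(capacity_sectors * SECTOR_SIZE * QCOW_OVERHEAD_FACTOR)
    initial_bytes = initial_size_sectors * SECTOR_SIZE
    if initial_bytes > max_size:
        raise InvalidParameterException("initial size",
                                        initial_size_sectors)
    return initial_bytes
```

With the values from the log, a capacity of 0 makes max_size 0, so the requested initial size of 152520146 sectors (78090314752 bytes) fails exactly as shown above.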


Expected results:
The disk move should succeed.


Additional info:

[root@vega13 ~]# lvs -o all |grep a2ba792b-23ff-4373-84b7-85fd5a1af047

  vJdO3U-E7xv-I4Ce-GMIi-LiaF-NrXw-UyPH1c a2ba792b-23ff-4373-84b7-85fd5a1af047 517268b4-e6b5-4890-8266-b0a53932a3b3/a2ba792b-23ff-4373-84b7-85fd5a1af047 /dev/517268b4-e6b5-4890-8266-b0a53932a3b3/a2ba792b-23ff-4373-84b7-85fd5a1af047 /dev/mapper/517268b4--e6b5--4890--8266--b0a53932a3b3-a2ba792b--23ff--4373--84b7--85fd5a1af047        linear     public                                                 inherit                                                                                                               -1  -1   auto  80.00g         36                                                                                                                                                                                                                                                                                                                                                                                                                           IU_b20deb5a-b087-4281-8891-2b3252cb96f4,MD_12,PU_00000000-0000-0000-0000-000000000000                    2016-07-06 12:38:57 +0300                             pluto.scl.lab.tlv.redhat.com                      -1   -1      -1 unknown      unknown              unknown              unknown    unknown                                                                                                                                                                                                                                             unknown         unknown         unknown -wi------- lvm2 4VvgN2-vo3L-Ptlf-FQXs-Lej2-TcJw-1FDjt5 517268b4-e6b5-4890-8266-b0a53932a3b3 wz--n- writeable  extendable                       normal                1023.62g 932.50g                                     128.00m  8189 7460     0     0   1           0  12   0 507 
MDT_CLASS=Data,MDT_DESCRIPTION=Openstack_Data_XtremIO,MDT_IOOPTIMEOUTSEC=10,MDT_LEASERETRIES=3,MDT_LEASETIMESEC=60,MDT_LOCKPOLICY=,MDT_LOCKRENEWALINTERVALSEC=5,MDT_LOGBLKSIZE=512,MDT_MASTER_VERSION=1,MDT_PHYBLKSIZE=512,MDT_POOL_DESCRIPTION=Openstack,MDT_POOL_DOMAINS=517268b4-e6b5-4890-8266-b0a53932a3b3:Active,MDT_POOL_SPM_ID=-1,MDT_POOL_SPM_LVER=-1,MDT_POOL_UUID=f95e49eb-fe39-45e7-9867-9a889610bd68,MDT_PV0=pv:3514f0c5a516001dc&44&uuid:WROzGu-CLtH-eKhS-LrIu-V5aJ-BfCm-ouApMd&44&pestart:0&44&pecount:8189&44&mapoffset:0,MDT_ROLE=Master,MDT_SDUUID=517268b4-e6b5-4890-8266-b0a53932a3b3,MDT_TYPE=ISCSI,MDT_VERSION=4,MDT_VGUUID=4VvgN2-vo3L-Ptlf-FQXs-Lej2-TcJw-1FDjt5,MDT__SHA_CKSUM=68bb1883aee7f964a09445ff1c6211601e53b11e,RHAT_storage_domain              2        2   <63.99m   128.00m unmanaged



Logs:
https://drive.google.com/open?id=1z643G21XyGFNzoqN3PbJYWbYTBbwb77N

Comment 1 Tal Nisan 2018-09-05 16:45:10 UTC
Benny, can you have a look please?

Comment 2 Benny Zlotnik 2018-09-07 17:23:26 UTC
The problem appears to be the disk having 0 bytes as its size. I am not sure how that came to be, since the disk was created over two years ago.
I will check whether this issue is reproducible today as well.

This probably causes https://bugzilla.redhat.com/show_bug.cgi?id=1625669 as well.

Comment 3 Benny Zlotnik 2018-09-16 08:02:02 UTC
I couldn't reproduce it and I found this bug: https://bugzilla.redhat.com/1434927

As this disk was created before the validation was introduced, I suppose it was possible back then, probably with incorrect params sent via REST

Comment 4 Tal Nisan 2018-09-16 12:32:02 UTC

*** This bug has been marked as a duplicate of bug 1434927 ***

Comment 5 Tal Nisan 2018-09-16 12:32:54 UTC
*** Bug 1625669 has been marked as a duplicate of this bug. ***

Comment 6 Elad 2018-09-16 12:35:43 UTC
I disagree with closing this bug. The fact that the disk was created before the validation was introduced doesn't mean we can ignore such cases.
What should a customer do if they encounter it? Delete the disk? That is not a good approach.

Comment 7 Tal Nisan 2018-09-16 13:00:26 UTC
We have taken care of the root cause that can lead to such an issue. It's a corner case that can only be hit by sending bad parameters via REST, and it was fixed quite quickly after the 4.1 GA, so the chances that a customer will hit it are slim.
In case a customer does, we can create a KB article with steps to rectify the situation.

Comment 8 Simon 2018-10-09 14:40:24 UTC
I seem to have hit this same problem today when trying to move a disk on oVirt 4.2.

ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engineScheduled-Thread-77) [] EVENT_ID: VDS_BROKER_COMMAND_FAILURE(10,802), VDSM blade15.crt.lan command HSMGetAllTasksStatusesVDS failed: Error creating a new volume: (u"Volume creation e6171aae-2c5b-4c91-84fc-506c0e835928 failed: Invalid parameter: 'initial size=122016117'",)

Is there a way to rectify this problem?

Comment 9 Elad 2018-10-10 07:04:56 UTC
Hi Simon, can you please attach the logs?

Comment 10 Simon 2018-10-10 08:28:09 UTC
Created attachment 1492417 [details]
engine.log

Comment 11 Nir Soffer 2018-10-10 08:30:05 UTC
(In reply to Benny Zlotnik from comment #2)
> The problem appears to be the disk having 0 bytes as it's size.

This is probably the result of a failed resize. Eyal is working on a similar bug.

Comment 12 Nir Soffer 2018-10-10 08:33:30 UTC
(In reply to Benny Zlotnik from comment #3)
> I couldn't reproduce it and I found this bug:
> https://bugzilla.redhat.com/1434927
> 
> As this disk was created before the validation was introduced, I suppose it
> was possible back then, probably with incorrect params sent via REST

No, this is not the same issue, and not a duplicate. But this may be a duplicate of Eyal's bug about setting the disk size to zero before a resize: if the resize fails, the disk size remains zero.

Comment 13 Simon 2018-10-10 09:12:55 UTC
Created attachment 1492431 [details]
vdsm.log

Comment 14 Eyal Shenitzky 2018-10-10 11:00:10 UTC
The failing disk is VirtIO-SCSI on an iSCSI storage domain, thin provisioned.
The disk also has a number of snapshots.

Comment 15 Elad 2019-03-12 13:25:41 UTC
This bug was closed as DUP of a bug that was fixed in 4.1.2.
Also, it was seen by users in 4.2 (see comment #8).
Hence, re-opening.

Comment 16 Nir Soffer 2019-03-12 16:05:49 UTC
Elad, please explain how to reproduce this.

The error:

    2018-09-04 15:10:51,101+0300 ERROR (tasks/6) [storage.Volume] The requested initial 78090314752 is bigger than the max size 0 (blockVolume:346)
    
This says that the disk size is 0. How do you create a disk with a size of zero bytes?

How are you going to use a disk with a size of zero bytes? I think the only thing
we can do with such a disk is delete it, as it cannot contain any data.

Comment 17 Elad 2019-03-14 09:03:00 UTC
This was seen for disk migration, also by Simon (see comment #13). 
Apart from that, I don't have any more information on how to reproduce.

Comment 18 Eyal Shenitzky 2019-05-16 11:58:05 UTC
It seems like there might be a misunderstanding. (In reply to Nir Soffer from comment #16)
> Elad, please explain how to reproduce this.
> 
> The error:
> 
>     2018-09-04 15:10:51,101+0300 ERROR (tasks/6) [storage.Volume] The
> requested initial 78090314752 is bigger than the max size 0 (blockVolume:346)
>     
> Say that the disk size is 0. How do you create disk with size of zero bytes?

I think you are confusing this with another bug; this error does not appear in the attached VDSM log.

> 
> How are you going to use disk with size of zero bytes? I think the only thing
> we can do with such disk is delete it, as it cannot contain any data.


Is this bug still reproducible?

Comment 19 Elad 2019-05-16 12:01:25 UTC
Not that I'm aware of, Avihai?

Comment 20 Avihai 2019-05-16 12:13:26 UTC
(In reply to Elad from comment #19)
> Not that I'm aware of, Avihai?

AFAIK, we have not encountered this issue lately.
If you do not have enough data to investigate and solve it, please close it; if it resurfaces, we will reopen it.

Comment 21 Nir Soffer 2019-05-16 13:23:36 UTC
We are investigating an issue where the engine sends an invalid size (0, or smaller than the parent size).
The fix discussed is to use the parent size instead.

Elad, can you share details of the broken image with size=0?

    qemu-img info /dev/vg-name/lv-name

    vdsm-client Volume getInfo storagepoolID=... storagedomainID=... imageID=... volumeID=...
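
The fallback described above could be sketched like this (an assumption of how the discussed fix might look, not the actual patch; the function name is hypothetical):

```python
def effective_capacity(engine_capacity, parent_capacity):
    """Guard against the engine sending an invalid capacity.

    If the engine sends 0 (or anything smaller than the parent
    volume's capacity), fall back to the parent's capacity, as
    discussed in this comment. All sizes are in bytes.
    """
    if engine_capacity <= 0 or engine_capacity < parent_capacity:
        return parent_capacity
    return engine_capacity
```

For the broken disk in this report, an engine-supplied capacity of 0 would then be replaced by the parent volume's 80 GiB capacity, and the initial-size validation would pass.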

