Bug 1711902

Summary: ovirt-engine-4.1.11.2 fails to add disks with vdsm-4.30 hosts and 4.1 compatibility level: InvalidParameterException: Invalid parameter: 'DiskType=2'
Product: Red Hat Enterprise Virtualization Manager
Reporter: Juan Orti <jortialc>
Component: vdsm
Assignee: shani <sleviim>
Status: CLOSED NOTABUG
QA Contact: Avihai <aefrat>
Severity: high
Docs Contact:
Priority: high
Version: 4.1.11
CC: abpatil, amashah, aperotti, bugs, derez, gwatson, juzhou, lleistne, lsurette, mmartinv, mxie, nsoffer, pelauter, rdlugyhe, sbonazzo, sleviim, srevivo, tnisan, tzheng, xiaodwan, ycui, zili
Target Milestone: ovirt-4.4.0
Keywords: Regression, TestBlocker, ZStream
Target Release: 4.3.5
Flags: lsvaty: testing_plan_complete-
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version: vdsm-4.30.22
Doc Type: Bug Fix
Doc Text:
In a Red Hat Virtualization (RHV) environment with VDSM version 4.3 and Manager version 4.1, the DiskTypes are parsed as int values. However, in an RHV environment with Manager version > 4.1, the DiskTypes are parsed as strings. That compatibility mismatch produced an error: "VDSM error: Invalid parameter: 'DiskType=2'". The current release fixes this issue by changing the string value back to an int, so the operation succeeds with no error.
Story Points: ---
Clone Of:
: 1723873 (view as bug list)
Environment:
Last Closed: 2019-12-19 06:44:32 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Storage
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1541529, 1723873
Attachments:
engine-v2v-import.log (flags: none)
vdsm-v2v-import.log (flags: none)
disktype=1.meta (flags: none)

Comment 1 Juan Orti 2019-05-28 06:12:45 UTC
After downgrading to vdsm-4.20, adding a disk works again.

Comment 2 Tal Nisan 2019-06-10 14:20:21 UTC
Hi Juan,
Is the Engine version 4.3 or 4.1?

Comment 3 Juan Orti 2019-06-11 07:11:27 UTC
Hi,

The engine is version ovirt-engine-4.1.11.2-0.1.el7.noarch

Steps to reproduce:

1. Install hosts with vdsm-4.30.13-4.el7ev.x86_64
2. Use ovirt-engine-4.1.11.2-0.1.el7.noarch
3. Set the cluster and data center compatibility level to 4.1
4. Add a virtual disk to a VM on an NFS storage domain

Results:

The following error appears in engine.log, and the disk cannot be added:

2019-05-20 11:16:40,791+0200 INFO  (jsonrpc/2) [vdsm.api] START createVolume(sdUUID=u'a0814562-cece-4426-aeb0-42f565fc6fcc', spUUID=u'5849b030-626e-47cb-ad90-3ce782d831b3', imgUUID=u'35cbcdf9-1dad-419d-9a32-2e9b9118ae86', size=u'21474836480', volFormat=5, preallocate=2, diskType=2, volUUID=u'75184f63-fa62-4d6b-afa2-d0172773e737', desc=u'{"DiskAlias":"vmname_Disk2","DiskDescription":"test"}', srcImgUUID=u'00000000-0000-0000-0000-000000000000', srcVolUUID=u'00000000-0000-0000-0000-000000000000', initialSize=None) from=::ffff:10.0.0.1,51444, flow_id=43c01a35-414c-4261-b263-cca6f7a61ef1, task_id=4413556a-b81b-42f8-9f7e-769951708557 (api:48)
2019-05-20 11:16:40,792+0200 INFO  (jsonrpc/2) [vdsm.api] FINISH createVolume error=Invalid parameter: 'DiskType=2' from=::ffff:10.0.0.1,51444, flow_id=43c01a35-414c-4261-b263-cca6f7a61ef1, task_id=4413556a-b81b-42f8-9f7e-769951708557 (api:52)
2019-05-20 11:16:40,792+0200 ERROR (jsonrpc/2) [storage.TaskManager.Task] (Task='4413556a-b81b-42f8-9f7e-769951708557') Unexpected error (task:875)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 882, in _run
    return fn(*args, **kargs)
  File "<string>", line 2, in createVolume
  File "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 50, in method
    ret = func(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/hsm.py", line 1454, in createVolume
    volFormat, srcVolUUID, diskType=diskType, preallocate=preallocate)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/sd.py", line 852, in validateCreateVolumeParams
    volFormat, srcVolUUID, diskType=diskType, preallocate=preallocate)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/sd.py", line 601, in validateCreateVolumeParams
    raise se.InvalidParameterException("DiskType", diskType)
InvalidParameterException: Invalid parameter: 'DiskType=2'
2019-05-20 11:16:40,792+0200 INFO  (jsonrpc/2) [storage.TaskManager.Task] (Task='4413556a-b81b-42f8-9f7e-769951708557') aborting: Task is aborted: u"Invalid parameter: 'DiskType=2'" - code 100 (task:1181)
2019-05-20 11:16:40,792+0200 ERROR (jsonrpc/2) [storage.Dispatcher] FINISH createVolume error=Invalid parameter: 'DiskType=2' (dispatcher:83)
2019-05-20 11:16:40,793+0200 INFO  (jsonrpc/2) [jsonrpc.JsonRpcServer] RPC call Volume.create failed (error 1000) in 0.00 seconds (__init__:312)

Expected result:

Disk created and attached to the VM

Workaround:

Downgrading the hosts to vdsm-4.20 fixes the issue.
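
A minimal, self-contained sketch of the failure mode shown in the traceback above (not the actual vdsm code; the set of accepted names is only illustrative). A 4.1 engine sends the legacy integer 2, while the strict check in vdsm 4.30 accepts only string disk-type names:

# Sketch only: a simplified stand-in for the validateCreateVolumeParams check,
# with an illustrative set of accepted names.
VALID_DISK_TYPES = frozenset(["DATA", "ISOF", "MEMD", "MEMM", "OVFS"])


class InvalidParameterException(Exception):
    def __init__(self, name, value):
        msg = "Invalid parameter: %r" % ("%s=%s" % (name, value))
        super(InvalidParameterException, self).__init__(msg)


def validate_disk_type(diskType):
    # Only string names pass; the legacy int 2 sent by a 4.1 engine is rejected.
    if diskType not in VALID_DISK_TYPES:
        raise InvalidParameterException("DiskType", diskType)


validate_disk_type("DATA")      # accepted: string name from an engine >= 4.2
try:
    validate_disk_type(2)       # legacy int from a 4.1 engine
except InvalidParameterException as e:
    print(e)                    # Invalid parameter: 'DiskType=2'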

Comment 5 Nir Soffer 2019-06-25 09:16:05 UTC
This is a regression in 4.3, breaking compatibility with 4.1 engine.

I suggest to mark this as blocker for 4.3.5.

Comment 8 Avihai 2019-06-30 07:11:37 UTC
Hi Shani,

This bug is marked as "Blocks: 1723873", but in fact it does not block that bug; 1723873 is its 4.3.5 clone.
Can you please remove the "Blocks" marking?

Comment 9 Avihai 2019-06-30 07:14:31 UTC
Tal, I think we support two major versions back, meaning that in 4.4 we will not support 4.1 (only 4.2 and 4.3), so this bug is not relevant in 4.4.
If it is still relevant, we will only be able to test it with a 4.2 host (running a 4.4 engine).
WDYT?

Comment 10 shani 2019-06-30 07:16:26 UTC
(In reply to Avihai from comment #8)
> Hi Shani,
> 
> This bug is marked as "Blocks: 1723873", but in fact it does not block that
> bug; 1723873 is its 4.3.5 clone.
> Can you please remove the "Blocks" marking?

It seems that Nir suggested marking it as a blocker (comment 5).
Nir, what do you say?

Comment 11 Avihai 2019-06-30 11:32:05 UTC
I think what Nir said in comment 5 is what we saw with Yosi today, which introduces a regression in 4.3.5 (engine + VDSM); Yossi will open a new regression bug for it.

In the original scenario (Bug 1723873), which uses engine 4.1, the fix works, so Bug 1723873 is verified and not blocked.
The scenario with 4.3.5 (engine + VDSM) is a new Bug 1725390, seen after this fix was introduced and with a different setup (the engine is 4.3.5, not 4.1).

Comment 12 Sandro Bonazzola 2019-07-02 09:56:54 UTC
(In reply to Nir Soffer from comment #5)
> This is a regression in 4.3, breaking compatibility with 4.1 engine.
> 
> I suggest to mark this as blocker for 4.3.5.

Bug has been cloned to 4.3.5 in bug #1723873 which is already in verified state

Comment 13 Nir Soffer 2019-07-03 20:43:16 UTC
(In reply to Sandro Bonazzola from comment #12)
> (In reply to Nir Soffer from comment #5)
> > This is a regression in 4.3, breaking compatibility with 4.1 engine.
> > 
> > I suggest to mark this as blocker for 4.3.5.
> 
> Bug has been cloned to 4.3.5 in bug #1723873 which is already in verified
> state

The fix was verified to allow diskType=2, as sent by engine 4.1. But the same
fix introduced a regression, failing when diskType="2" is sent by engine >= 4.2
using 4.1 compatibility version.

I don't know why the original bug was marked as verified when the fix introduced
a regression. Anyway, we cannot release 4.3.5 without resolving this bug.

Comment 14 shani 2019-07-04 07:17:17 UTC
This patch converts the disktype to a string, so it handles both scenarios.
I believe it solves the bug.
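
One way to handle both scenarios, following the description above and in comment 13, is to normalize the incoming value before validating it. This is only a hedged sketch with hypothetical names, not the actual vdsm patch:

# Hypothetical helper sketching the normalization described above; the modern
# names listed here are illustrative, not the exact vdsm constants.
MODERN_DISK_TYPES = frozenset(["DATA", "ISOF", "MEMD", "MEMM", "OVFS"])


def normalize_disk_type(diskType):
    value = str(diskType)      # convert to a string first (comment 14)
    if value == "2":           # legacy data-disk value: int 2 or string "2"
        return "DATA"
    if value in MODERN_DISK_TYPES:
        return value
    raise ValueError("Invalid parameter: 'DiskType=%s'" % (value,))


assert normalize_disk_type(2) == "DATA"       # 4.1 engine (JSON int)
assert normalize_disk_type("2") == "DATA"     # engine >= 4.2 with 4.1 compatibility level
assert normalize_disk_type("DATA") == "DATA"  # engine >= 4.2, modern value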

Comment 17 mxie@redhat.com 2019-08-09 09:18:54 UTC
Created attachment 1602072 [details]
engine-v2v-import.log

Comment 18 mxie@redhat.com 2019-08-09 09:19:23 UTC
Created attachment 1602073 [details]
vdsm-v2v-import.log

Comment 19 Nir Soffer 2019-08-13 07:49:27 UTC
Daniel, is it possible that some engine version use 'DiskType=1'?
(see comment 15).

Comment 20 Daniel Erez 2019-08-13 17:06:15 UTC
(In reply to Nir Soffer from comment #19)
> Daniel, is it possible that some engine version use 'DiskType=1'?
> (see comment 15).

The engine should send only the disk type as a string or '2' for legacy.

@Ming - Can you please attach also the engine logs?

Comment 21 Daniel Erez 2019-08-13 17:20:23 UTC
(In reply to Daniel Erez from comment #20)
> (In reply to Nir Soffer from comment #19)
> > Daniel, is it possible that some engine version use 'DiskType=1'?
> > (see comment 15).
> 
> The engine should send only the disk type as a string or '2' for legacy.
> 
> @Ming - Can you please attach also the engine logs?

And vdsm logs as well please.

Comment 22 mxie@redhat.com 2019-08-14 02:36:32 UTC
(In reply to Daniel Erez from comment #21)
> (In reply to Daniel Erez from comment #20)
> > (In reply to Nir Soffer from comment #19)
> > > Daniel, is it possible that some engine version use 'DiskType=1'?
> > > (see comment 15).
> > 
> > The engine should send only the disk type as a string or '2' for legacy.
> > 
> > @Ming - Can you please attach also the engine logs?
> 
> And vdsm logs as well please.

Hi Daniel,
  
    Please refer to the attached files "engine-v2v-import.log" and "vdsm-v2v-import.log".

Comment 23 mxie@redhat.com 2019-08-21 06:20:49 UTC
Hi Daniel,

   Could you please help confirm whether the problem in comment 15 is a duplicate of this bug? If not, I will file a new one, because the "DiskType=1" problem has blocked our v2v automated testing and needs to be fixed as soon as possible. Thanks!

Comment 24 Daniel Erez 2019-08-27 14:03:05 UTC
(In reply to mxie from comment #23)
> Hi Daniel,
> 
>    Could you please help confirm whether the problem in comment 15 is a
> duplicate of this bug? If not, I will file a new one, because the "DiskType=1"
> problem has blocked our v2v automated testing and needs to be fixed as soon as
> possible. Thanks!

I think we can keep tracking the issue on this bug, as the fix seems to expose it.
I.e., the fix changed LEGACY_DATA_DISKTYPE from "1" to "2", but it seems there's
a flow in copyImage that still expects "1" (see log [*]).

As the copyImage flow uses the image metadata, I guess that we might have
'DISKTYPE=1' in the source image metadata. I could reproduce the issue when
setting it manually in the meta file.

@Ming - is the source image (in export domain) still available? can you please
attach its .meta file?

@Nir - do you recall having 'DISKTYPE=1' in the meta file at some point?

[*]
2019-08-09 13:13:16,922+0800 ERROR (tasks/9) [storage.Image] Unexpected error (image:797)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/storage/image.py", line 788, in copyCollapsed
    initialSize=initialSizeBlk)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/sd.py", line 912, in createVolume
    initialSize=initialSize)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/volume.py", line 1154, in create
    volFormat, srcVolUUID, diskType=diskType, preallocate=preallocate)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/sd.py", line 902, in validateCreateVolumeParams
    volFormat, srcVolUUID, diskType=diskType, preallocate=preallocate)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/sd.py", line 632, in validateCreateVolumeParams
    raise se.InvalidParameterException("DiskType", diskType)
InvalidParameterException: Invalid parameter: 'DiskType=1'
2019-08-09 13:13:16,923+0800 ERROR (tasks/9) [storage.TaskManager.Task] (Task='500c225b-81a0-4a00-9f88-99b30c608755') Unexpected error (task:875)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 882, in _run
    return fn(*args, **kargs)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 336, in run
    return self.cmd(*self.argslist, **self.argsdict)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/securable.py", line 79, in wrapper
    return method(self, *args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/sp.py", line 1622, in copyImage
    postZero, force, discard)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/image.py", line 799, in copyCollapsed
    (dstVolUUID, str(e)))

Comment 25 Nir Soffer 2019-08-27 14:47:35 UTC
(In reply to Daniel Erez from comment #24)
> (In reply to mxie from comment #23)
> As the copyImage flow uses the image metadata, I guess that we might have
> 'DISKTYPE=1' in the source image metadata. I could reproduce the issue when
> setting it manually in the meta file.

We don't support modifying volume metadata with invalid values.

> @Nir - do you recall having 'DISKTYPE=1' in the meta file at some point?

No.

In the past, vdsm wrote whatever the engine sent to storage, and during a copy it
read the metadata from the source image and created the metadata for the target
image based on it.

If we have a volume with disktype=1, then:
- the engine sent this value to vdsm,
- or non-storage vdsm code modified this value during import,
- or some tool modified this value, bypassing vdsm.

> [*]
> 2019-08-09 13:13:16,922+0800 ERROR (tasks/9) [storage.Image] Unexpected
> error (image:797)
> Traceback (most recent call last):
>   File "/usr/lib/python2.7/site-packages/vdsm/storage/image.py", line 788,
> in copyCollapsed
>     initialSize=initialSizeBlk)
>   File "/usr/lib/python2.7/site-packages/vdsm/storage/sd.py", line 912, in


This code reads the source volume parameters:

 730                 volParams = srcVol.getVolumeParams()

And calls createVolume with:

 777                 destDom.createVolume(
 778                     imgUUID=dstImgUUID,
 779                     capacity=volParams['capacity'],
 780                     volFormat=dstVolFormat,
 781                     preallocate=volParams['prealloc'],
 782                     diskType=volParams['disktype'],
 783                     volUUID=dstVolUUID,
 784                     desc=descr,
 785                     srcImgUUID=sc.BLANK_UUID,
 786                     srcVolUUID=sc.BLANK_UUID,
 787                     initial_size=initial_size)

> createVolume
>     initialSize=initialSize)
>   File "/usr/lib/python2.7/site-packages/vdsm/storage/volume.py", line 1154,
> in create
>     volFormat, srcVolUUID, diskType=diskType, preallocate=preallocate)
>   File "/usr/lib/python2.7/site-packages/vdsm/storage/sd.py", line 902, in
> validateCreateVolumeParams
>     volFormat, srcVolUUID, diskType=diskType, preallocate=preallocate)
>   File "/usr/lib/python2.7/site-packages/vdsm/storage/sd.py", line 632, in
> validateCreateVolumeParams
>     raise se.InvalidParameterException("DiskType", diskType)
> InvalidParameterException: Invalid parameter: 'DiskType=1'

So this means the source volume has DISKTYPE=1 in the metadata.

We must have the logs of the flow that created the source volume metadata, or
instructions to reproduce such a volume.

However, since we got several reports about this, even if we don't understand
how "1" got there, I think we can allow this value when reading old metadata
by converting the legacy values "1" and "2" to "DATA".
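
A hedged sketch of that suggestion (the function name and mapping here are hypothetical, not the actual vdsm change): when reading an existing volume's metadata, map the legacy numeric values to "DATA" so copy flows such as copyImage do not fail on old volumes.

# Hypothetical sketch of tolerating legacy numeric disk types when reading a
# volume's .meta metadata; not the actual vdsm implementation.
LEGACY_DISK_TYPES = {"1": "DATA", "2": "DATA"}


def disk_type_from_metadata(md):
    """Return a modern disk type for a metadata dict parsed from a .meta file."""
    value = md["DISKTYPE"]
    return LEGACY_DISK_TYPES.get(value, value)


# A source volume written with DISKTYPE=1 (like the attached disktype=1.meta)
# would then be treated as a DATA disk instead of failing
# validateCreateVolumeParams during copyImage.
print(disk_type_from_metadata({"DISKTYPE": "1"}))     # DATA
print(disk_type_from_metadata({"DISKTYPE": "DATA"}))  # DATA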

Comment 26 mxie@redhat.com 2019-08-28 06:12:41 UTC
> I think we can keep tracking the issue on this bug, as the fix seems to expose it.
> I.e., the fix changed LEGACY_DATA_DISKTYPE from "1" to "2", but it seems there's
> a flow in copyImage that still expects "1" (see log [*]).
> @Ming - is the source image (in export domain) still available? can you please
> attach its .meta file?

Hi Daniel,

   I have already attached the .meta file. As this bug has blocked virt-v2v testing, it will be very troublesome for us to keep tracking the "DiskType=1" problem on this bug, because I see this bug will be fixed in RHV 4.4. Can we open a new bug on RHV 4.3.z to fix the "DiskType=1" problem?

Thanks

Comment 27 mxie@redhat.com 2019-08-28 06:14:02 UTC
Created attachment 1608841 [details]
disktype=1.meta

Comment 28 Daniel Erez 2019-08-28 10:50:10 UTC
(In reply to mxie from comment #26)
> > I think we can keep tracking the issue on this bug, as the fix seems to expose it.
> > I.e., the fix changed LEGACY_DATA_DISKTYPE from "1" to "2", but it seems there's
> > a flow in copyImage that still expects "1" (see log [*]).
> > @Ming - is the source image (in export domain) still available? can you please
> > attach its .meta file?
> 
> Hi Daniel,
> 
>    I have already attached the .meta file. As this bug has blocked virt-v2v
> testing, it will be very troublesome for us to keep tracking the "DiskType=1"
> problem on this bug, because I see this bug will be fixed in RHV 4.4. Can we
> open a new bug on RHV 4.3.z to fix the "DiskType=1" problem?

Sure, you can open a new bug if more convenient, just link to this one for reference.
I see that in the .meta file we indeed have "DiskType=1".
Not sure how that happened; in the new bug, please attach exact instructions for
reproducing it, or the logs of the flow when this volume was created (engine and vdsm).
As Nir suggested, I think we'll have to allow both "1" and "2" in vdsm to
support those legacy volumes. 

> 
> Thanks

Comment 29 Daniel Gur 2019-08-28 13:12:51 UTC
sync2jira

Comment 30 Daniel Gur 2019-08-28 13:17:03 UTC
sync2jira

Comment 33 RHV bug bot 2019-10-22 17:25:55 UTC
WARN: Bug status wasn't changed from MODIFIED to ON_QA due to the following reason:

[Found non-acked flags: '{}', ]

For more info please contact: rhv-devops

Comment 34 RHV bug bot 2019-10-22 17:39:33 UTC
WARN: Bug status wasn't changed from MODIFIED to ON_QA due to the following reason:

[Found non-acked flags: '{}', ]

For more info please contact: rhv-devops

Comment 35 RHV bug bot 2019-10-22 17:46:45 UTC
WARN: Bug status wasn't changed from MODIFIED to ON_QA due to the following reason:

[Found non-acked flags: '{}', ]

For more info please contact: rhv-devops

Comment 36 RHV bug bot 2019-10-22 18:02:33 UTC
WARN: Bug status wasn't changed from MODIFIED to ON_QA due to the following reason:

[Found non-acked flags: '{}', ]

For more info please contact: rhv-devops

Comment 37 Avihai 2019-10-31 11:45:23 UTC
Hi Shani,
The reproduction scenario starts with creating a cluster and data center with 4.1 compatibility version.
As this bug should be tested in 4.4, there will be no option to create a cluster/DC with 4.1 compatibility version (only 4.2 or later).
How do you suggest reproducing/verifying this bug in 4.4?

Comment 40 RHV bug bot 2019-11-19 11:53:20 UTC
WARN: Bug status wasn't changed from MODIFIED to ON_QA due to the following reason:

[Found non-acked flags: '{}', ]

For more info please contact: rhv-devops

Comment 41 RHV bug bot 2019-11-19 12:03:21 UTC
WARN: Bug status wasn't changed from MODIFIED to ON_QA due to the following reason:

[Found non-acked flags: '{}', ]

For more info please contact: rhv-devops

Comment 44 RHV bug bot 2019-12-13 13:16:42 UTC
WARN: Bug status (ON_QA) wasn't changed but the following should be fixed:

[Found non-acked flags: '{}', ]

For more info please contact: rhv-devops