Bug 1700623

Summary: Moving disk results in wrong SIZE/CAP key in the volume metadata
Product: Red Hat Enterprise Virtualization Manager
Reporter: Germano Veit Michel <gveitmic>
Component: vdsm
Assignee: Vojtech Juranek <vjuranek>
Status: CLOSED ERRATA
QA Contact: Shir Fishbain <sfishbai>
Severity: urgent
Priority: urgent
Docs Contact:
Version: 4.2.8
CC: aefrat, bcholler, eshenitz, fsimonce, jinjli, lsurette, mkalinin, nsoffer, pelauter, pvilayat, rdlugyhe, rhodain, royoung, srevivo, tnisan, vjuranek, ycui
Target Milestone: ovirt-4.4.0
Keywords: ZStream
Target Release: ---
Flags: lsvaty: testing_plan_complete-
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version: rhv-4.4.0-29
Doc Type: Bug Fix
Doc Text:
Previously, moving a disk resulted in the wrong SIZE/CAP key in the volume metadata. This happened because creating a volume that had a parent overwrote the size of the newly-created volume with the parent size. As a result, the volume metadata contained the wrong volume size value. The current release fixes this issue, so the volume metadata contains the correct value.
Story Points: ---
Clone Of:
Clones: 1707932 (view as bug list)
Environment:
Last Closed: 2020-08-04 13:26:28 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Storage
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1700189, 1703275, 1707932, 1707934

Description Germano Veit Michel 2019-04-17 00:09:49 UTC
Description of problem:

Moving a disk from storage domain A to B results in the wrong SIZE key in the volume metadata on B if the volume was previously extended.

Before moving
=============

# lvs -o +tags| grep b9fd9e73-32d3-473a-8cb5-d113602f76e1 | awk -F ' ' '{print $1,$2,$4,$5}'
359c2ea7-0a73-4296-8109-b799d9bfbd08 51e44de8-2fc0-4e99-8860-6820ff023108 1.00g IU_b9fd9e73-32d3-473a-8cb5-d113602f76e1,MD_23,PU_5f478dfb-78bb-4217-ad63-6927dab7cc90
5f478dfb-78bb-4217-ad63-6927dab7cc90 51e44de8-2fc0-4e99-8860-6820ff023108 5.00g IU_b9fd9e73-32d3-473a-8cb5-d113602f76e1,MD_22,PU_00000000-0000-0000-0000-000000000000

# dd status=none if=/dev/51e44de8-2fc0-4e99-8860-6820ff023108/metadata count=1 bs=512 skip=22 | grep -a SIZE
SIZE=10485760

# dd status=none if=/dev/51e44de8-2fc0-4e99-8860-6820ff023108/metadata count=1 bs=512 skip=23 | grep -a SIZE
SIZE=20971520

After moving
============

# lvs -o +tags| grep b9fd9e73-32d3-473a-8cb5-d113602f76e1 | awk -F ' ' '{print $1,$2,$4,$5}'
359c2ea7-0a73-4296-8109-b799d9bfbd08 43c67df7-2293-4756-9aa3-de09d67d7050 1.00g IU_b9fd9e73-32d3-473a-8cb5-d113602f76e1,MD_95,PU_5f478dfb-78bb-4217-ad63-6927dab7cc90
5f478dfb-78bb-4217-ad63-6927dab7cc90 43c67df7-2293-4756-9aa3-de09d67d7050 5.00g IU_b9fd9e73-32d3-473a-8cb5-d113602f76e1,MD_93,PU_00000000-0000-0000-0000-000000000000

# dd status=none if=/dev/43c67df7-2293-4756-9aa3-de09d67d7050/metadata count=1 bs=512 skip=93 | grep -a SIZE
SIZE=10485760

# dd status=none if=/dev/43c67df7-2293-4756-9aa3-de09d67d7050/metadata count=1 bs=512 skip=95 | grep -a SIZE
SIZE=10485760       <----------------------- wrong

The SIZE key in the metadata went from 20971520 on SRC SD to 10485760 (same as parent).
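
For reference, SIZE on these (V4) block domains is expressed in 512-byte blocks, so in bytes the two values above are:

    # SIZE is stored in 512-byte blocks in V4 volume metadata
    >>> 20971520 * 512    # value on the source SD: 10 GiB (the extended size)
    10737418240
    >>> 10485760 * 512    # value on the destination SD: 5 GiB (the parent size)
    5368709120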

Combined with BZ1700189, the severity of this is urgent.

Version-Release number of selected component (if applicable):
vdsm-4.20.47-1.el7ev
rhvm-4.2.8.5-0.1.el7ev.noarch

How reproducible:
100%

Steps to Reproduce:
1. Create a VM with a 5 GB disk
2. Snapshot it
3. Extend the disk by 5 GB
4. Move the disk to another SD

Additional info:
* Also happens on LIVE STORAGE MIGRATION
* The entire chain gets the wrong size, not just the leaf.

Comment 1 Germano Veit Michel 2019-04-17 00:11:46 UTC
Note: this was block storage to block storage

Comment 2 Germano Veit Michel 2019-04-17 00:24:00 UTC
The createVolume command on the destination SD looks right; it is not yet clear why the metadata is wrong.

2019-04-17 09:58:20,359+1000 INFO  (jsonrpc/2) [vdsm.api] START createVolume(sdUUID=u'43c67df7-2293-4756-9aa3-de09d67d7050', spUUID=u'da42e5a5-f6f7-49b4-8256-2adf690ddf4c', imgUUID=u'b9fd9e73-32d3-473a-8cb5-d113602f76e1', size=u'10737418240', volFormat=4, preallocate=2, diskType=u'DATA', volUUID=u'359c2ea7-0a73-4296-8109-b799d9bfbd08', desc=None, srcImgUUID=u'b9fd9e73-32d3-473a-8cb5-d113602f76e1', srcVolUUID=u'5f478dfb-78bb-4217-ad63-6927dab7cc90', initialSize=u'976128931') from=::ffff:10.64.24.161,49332, flow_id=23cc02dc-502c-4d33-9271-3f5b6b89a69a, task_id=c2e90abb-fa9c-415d-b9f7-e9d13520971d (api:46)
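
For reference, the size in this call matches the extended 10 GiB virtual size (the engine-side request is correct), and it maps to the SIZE value seen on the source SD:

    >>> 10737418240 / 1024**3    # size requested by the engine, in GiB
    10.0
    >>> 10737418240 // 512       # expected SIZE value in V4 metadata (512-byte blocks)
    20971520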

Comment 5 Nir Soffer 2019-04-17 08:46:01 UTC
The issue is this code in volume.py:
 
1148                 # Override the size with the size of the parent
1149                 size = volParent.getSize()

When creating a volume with a parent volume, vdsm silently overrides the size sent
by the engine.

The code was added in

commit 8a0236a2fdf4e81f9b73e9279606053797e14753
Author: Federico Simoncelli <fsimonce>
Date:   Tue Apr 17 18:33:51 2012 +0000

    Unify the volume creation code in volume.create
    
    This patch lays out the principles of the create volume flow (unified
    both for block and file storage domains).
    
    Signed-off-by: Federico Simoncelli <fsimonce>
    Change-Id: I0e44da32351a420f0536505985586b24ded81a2a
    Reviewed-on: http://gerrit.ovirt.org/3627
    Reviewed-by: Allon Mureinik <amureini>
    Reviewed-by: Ayal Baron <abaron>

The review does not exist on gerrit, and there is no info explaining why vdsm
needs to silently override the size sent by the engine and use the parent size.
Maybe this was needed in the past to work around some engine bug or an issue in
another vdsm flow.

So it seems that creating a volume chain with different sizes was always broken.

I think we need to:
- remove this override (see the sketch below)
- check if removing it breaks some other flow - it may break snapshot creation if the
  engine sends the wrong size; maybe this code "fixes" such a case.
- verify the metadata size when preparing an existing volume, and fix inconsistencies
  between the qcow2 virtual size and the volume size
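
As a rough illustration of the first point (a sketch only, not the actual vdsm patch):

    # Sketch only: the behavior change proposed above for volume.create().
    def new_volume_capacity(requested_size, parent_size):
        # Current behavior: the size sent by the engine is silently discarded,
        # so an extended leaf is recreated with the parent (smaller) size:
        #     return parent_size
        # Proposed behavior: honor the size requested by the engine; a child
        # may legitimately be larger than its parent after a disk extension.
        return requested_size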

Comment 8 Nir Soffer 2019-05-05 10:00:54 UTC
We have 2 patches in review:

- https://gerrit.ovirt.org/c/99539/ - this fixes the root cause: creating volumes
  with bad metadata.

- https://gerrit.ovirt.org/c/99541 - this currently fails to prepare a volume with
  bad metadata, so it would prevent corruption of the image when creating a snapshot,
  but it would also fail to start a VM or move a disk with such a volume. I think we
  can fix bad metadata when preparing a volume, since we already do this for the
  special zero metadata size (sketched below).

Both patches are small and simple, so a backport to 4.2 should be possible. Once this
is fixed upstream we can evaluate the backport to 4.2.
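
A minimal sketch of the prepare-time repair mentioned above (illustration only, not the gerrit patches; get_qcow2_virtual_size() and the volume accessors are hypothetical helpers, not vdsm APIs):

    # Sketch only: reconcile the metadata capacity with the qcow2 virtual size
    # when preparing a volume, instead of failing the prepare.
    def fix_capacity_on_prepare(vol):
        virtual_size = get_qcow2_virtual_size(vol.path)  # bytes, from the qcow2 header
        if vol.capacity() != virtual_size:
            # Same idea as the existing handling of the special zero metadata
            # size: trust the image and rewrite the metadata.
            vol.set_capacity(virtual_size)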

Comment 13 Nir Soffer 2019-05-09 16:53:30 UTC
Removed the 4.3 and 4.2 patches; they are listed in the clones for 4.4 and 4.2.

Comment 14 Nir Soffer 2019-05-09 17:02:06 UTC
Note that with 4.3 the default storage domain format is V5, and the volume capacity
is stored in the CAP key in bytes. In older formats (e.g. V4) the volume capacity is
stored in the SIZE key in 512-byte blocks.
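
For reference, converting between the two representations (plain Python, not vdsm code):

    # V4 block domains: SIZE is in 512-byte blocks.  V5 domains: CAP is in bytes.
    BLOCK_SIZE = 512

    def v4_size_to_bytes(size_blocks):
        return size_blocks * BLOCK_SIZE

    def bytes_to_v4_size(capacity_bytes):
        return capacity_bytes // BLOCK_SIZE

    # Example: the 10 GiB leaf from the description.
    assert v4_size_to_bytes(20971520) == 10 * 1024**3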

Comment 15 Federico Simoncelli 2019-05-09 21:17:31 UTC
(In reply to Nir Soffer from comment #5)
> The issue is this line in volume.py:
>  
> 1148                 # Override the size with the size of the parent
> 1149                 size = volParent.getSize()
> 
> When creating a volume with a parent volume, vdsm override the size sent
> by engine silently.
> 
> The code was added in
> 
> commit 8a0236a2fdf4e81f9b73e9279606053797e14753
> Author: Federico Simoncelli <fsimonce>
> Date:   Tue Apr 17 18:33:51 2012 +0000
> 
>     Unify the volume creation code in volume.create


Actually, the patch above seems to maintain existing behavior, which at that time applied only to block devices, but which today we consider problematic for either type (as the fix removes it from the common flow).

--- a/vdsm/storage/blockVolume.py
+++ b/vdsm/storage/blockVolume.py

-                # override size param by parent's size
-                size = pvol.getSize()

--- a/vdsm/storage/volume.py
+++ b/vdsm/storage/volume.py

+                # Override the size with the size of the parent
+                size = volParent.getSize()


Sadly there is indeed no earlier history about why that was there in the first place, but I also think it was to avoid using out-of-date sizes from the engine:

commit 8e53f0f22f3b52f39f4b47d273c279c9b27f1156
Author: Tabula Rasa <cleanslate>
Date:   Wed Jun 15 23:26:19 2011 +0300

    Initial commit

^8e53f0f22 156                 # override size param by parent's size
^8e53f0f22 157                 size = pvol.getSize()

Comment 16 Nir Soffer 2019-05-10 00:01:34 UTC
(In reply to Federico Simoncelli from comment #15)
> (In reply to Nir Soffer from comment #5)

Thanks for looking at this, I missed the older version of the code
which was slightly modified.

I found the patch adding this:

commit 5d02b60c693ce268722f5433a4e5833e5e346907 (HEAD)
Merge: debddc24a 7d018db0d
Author: Shahar Frank <sfrank>
Date:   Mon Mar 30 17:14:16 2009 +0300

    fix copy and SD metadata caching

But this patch also seems to preserve existing behavior:

@@@ -75,42 -76,42 +76,46 @@@ class BlockVolume(volume.Volume)
...

++                # override size param by parent's size
++                size = pvol.getSize()

...

--                log.debug("2%s" % pvol.getVolumePath())
--                size = pvol.getMetaParam(volume.SIZE)
--                log.debug("3%s" % pvol.getVolumePath())
                  res = pvol.clone(image_dir, volUUID, volFormat, preallocate)
...


Overriding the size seems to come from:

commit 0a46d211bb75ebbaee3f57ddc6eb01e04a3af94e
Author: Shahar Frank <sfrank>
Date:   Tue Mar 17 19:43:33 2009 +0200

    volume,fileVolume,BlockVolume - reorgenized. File seems to work


diff --git a/vdsm/storage/blockVolume.py b/vdsm/storage/blockVolume.py
index 74d05bdb6..f9f63c7c4 100644
--- a/vdsm/storage/blockVolume.py
+++ b/vdsm/storage/blockVolume.py
@@ -17,129 +17,137 @@ import volume

...

+                size = pvol.getMetaParam(volume.SIZE)

...


We don't have any info on the review and there is no documentation
explaining why the size is taken from the parent.

Maybe at that time we did not implement resize for snapshots, so taking the
size from the parent was always correct.

Comment 17 Vojtech Juranek 2019-05-10 07:34:19 UTC
Removed the link to the 4.3 backport.

Comment 18 Germano Veit Michel 2019-05-15 01:02:16 UTC
Moving discussion from BZ1700189

(In reply to Nir Soffer from comment #13)
> (In reply to Germano Veit Michel from comment #12)
> 
> Germano, lets not use this bug, since the root cause is bug 1700623.
> 
> > I started writing the tool, which is rather simple. 
> > 
> > However, the tool needs to use Image.prepare before doing the
> > Volume.getQemuImageInfo to retrieve the real size from the qcow2 header. But
> > https://gerrit.ovirt.org/#/c/99793/ should fix the metadata on Image.prepare
> > without raising exception (different from your first point), so this is a
> > bit confusing regarding what the tool should do.
> > 
> > I think the tool would:
> > 1) Retrieve the metadata size of all volumes first
> > 2) Prepare all volumes and get qemu size (which would fix the metadata)
> > 3) Print a list of all the volumes that have size mismatch (and should be
> > already fixed by this point)
> > 
> > What do you think?
> 
> System including the patches does not need any tool, volumes will be fixed
> automatically when you start a vm, or when you prepare a disk.

What about already running VMs? New snapshots do not call prepare since 4.2.
Any plans to cover this case as well?

> 
> A tool for fixing volumes is needed only for system without the fix (e.g.
> 4.2.8),
> and in this case preparing a volume will not fix anything.
> 

Right, but to get the tool one would need to upgrade, and by upgrading one gets the fixes, so doesn't that make the tool pointless?
In the best case scenario we would ship the tool at the same time as the fixes.

Do you agree?

> I think we can start with dump-volume-chain call, getting a json or aql with
> all volumes.
> 
> Then we can find the top volume in every chain and:
> 
> 1. Check if the image is prepared for a vm or by storage operation
>    I don't know a good way to do this, this is something engine manages.
> 
> 2. If the image is not used, prepare it with Image.prepare
> 
> 3. Get the backing chain info with all the info about the disk
> 
>     qemu-img info -U --backing-chain --output json /path/to/volume
> 
> 4. If the image was not used, tear it down with Image.teardown
> 
> With this you have all the info to report which volumes metadta is not
> synced with qcow2 metadata (qcow2 image) or actual file size (raw image).

In my draft tool I just prepare and tear down everything, expecting in-use images to fail the teardown.

One thing I am concerned about is VMs that are already running and affected (wrong SIZE in the MD, but no qcow2 with the wrong virtual size yet). Will the new fixes correct the SIZE in the MD before the new leaf is created with a wrong size? Will the patch that invalidates the size on volume.prepare do this on the host running the VM before the snapshot is taken?

Comment 19 Nir Soffer 2019-05-15 18:32:24 UTC
(In reply to Germano Veit Michel from comment #18)
> > System including the patches does not need any tool, volumes will be fixed
> > automatically when you start a vm, or when you prepare a disk.
> 
> What about already running VMs? New snapshtos do not call prepare since 4.2.
> Any plans to cover this case as well?


How do you get an already running VM on a system with the fixes? You need
to upgrade vdsm, and upgrading vdsm is blocked if VMs are running on the host.

The only ways to have a running VM on a version that includes the fix are:
1. Start the VM after the upgrade, preparing the volumes -> metadata fixed
2. Migrate the VM from an old host, preparing the volumes -> metadata fixed

> > A tool for fixing volumes is needed only for system without the fix (e.g.
> > 4.2.8),
> > and in this case preparing a volume will not fix anything.
> > 
> 
> Right, but to get the tool one would need to upgrade. 

No, the tool should be external to vdsm so you can use it on
a 4.2.x system in the field that you cannot upgrade at this point.

Once metadata is fixed, new snapshots will be safe. However, if a new
snapshot is extended and moved again, we will have broken metadata again.

> And by upgrading one
> will get the fixes, so it makes the tool pointless?

If you don't plan to fix running VMs with broken metadata (which would be
corrupted by creating a snapshot), and you think the best way is to upgrade,
then the tool is not very helpful.

> > I think we can start with dump-volume-chain call, getting a json or aql with
> > all volumes.
> > 
> > Then we can find the top volume in every chain and:
> > 
> > 1. Check if the image is prepared for a vm or by storage operation
> >    I don't know a good way to do this, this is something engine manages.
> > 
> > 2. If the image is not used, prepare it with Image.prepare
> > 
> > 3. Get the backing chain info with all the info about the disk
> > 
> >     qemu-img info -U --backing-chain --output json /path/to/volume
> > 
> > 4. If the image was not used, tear it down with Image.teardown
> > 
> > With this you have all the info to report which volumes metadta is not
> > synced with qcow2 metadata (qcow2 image) or actual file size (raw image).
> 
> In my draft tool I just prepare and teardown everything, expecting in-use
> images to fail to teardown.

It can break in many ways if some other code thinks that it owns the volume
while you try to check and fix it, and when you tear it down you can break that
other code.

There is one way to safely access volumes which are not used by a VM: start an
image transfer in the download direction for a volume. This locks the disk and
prepares the entire chain. Then you can check and fix the chain safely.
When you finish, you finalize the transfer, releasing the locks.
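
For illustration, locking a disk this way could look roughly like the following, based on the oVirt Python SDK (ovirtsdk4); the URL, credentials and disk id are placeholders, and the exact calls should be checked against the SDK's image-transfer examples:

    # Sketch only: start a download-direction image transfer to lock the disk and
    # prepare its chain, then finalize to release the locks.
    import time
    import ovirtsdk4 as sdk
    import ovirtsdk4.types as types

    connection = sdk.Connection(
        url="https://engine.example.com/ovirt-engine/api",
        username="admin@internal",
        password="password",
        insecure=True,
    )
    transfers_service = connection.system_service().image_transfers_service()
    transfer = transfers_service.add(types.ImageTransfer(
        disk=types.Disk(id="b9fd9e73-32d3-473a-8cb5-d113602f76e1"),
        direction=types.ImageTransferDirection.DOWNLOAD,
    ))
    transfer_service = transfers_service.image_transfer_service(transfer.id)
    while transfer_service.get().phase == types.ImageTransferPhase.INITIALIZING:
        time.sleep(1)

    # ... check and fix the prepared chain here ...

    transfer_service.finalize()
    connection.close()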

If the image is used by a VM, you will not be able to start an image transfer
for the active volume, but you also don't need to prepare or tear down anything.

If telling the users that they should not modify any disk or VM while running
the tool is good enough, then maybe preparing and tearing down is good enough.
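
For illustration, once a chain is prepared, the mismatch check could be scripted roughly like this (a sketch only; read_metadata_capacity() is a hypothetical helper, e.g. wrapping the dd command from the description):

    # Sketch only: compare the qcow2 virtual sizes of a prepared chain against the
    # capacities recorded in the volume metadata.
    import json
    import subprocess

    def chain_virtual_sizes(top_volume_path):
        out = subprocess.check_output([
            "qemu-img", "info", "-U", "--backing-chain", "--output", "json",
            top_volume_path,
        ])
        return {info["filename"]: info["virtual-size"] for info in json.loads(out)}

    def find_mismatches(top_volume_path):
        mismatches = []
        for path, virtual_size in chain_virtual_sizes(top_volume_path).items():
            if read_metadata_capacity(path) != virtual_size:
                mismatches.append((path, virtual_size))
        return mismatches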

> One thing I am concerned is with VMs already running and affected (wrong
> SIZE in MD, but no qcow2 with wrong virtual size yet). Will the new fixes
> correct the SIZE in the MD before the new leaf is created with a wrong size?

If the system was upgraded, the metadata will be fixed before the VM is running.

> Will that patch of invalidate size on volume.prepare do this on the host
> running the VM before the snapshot is done?

No, it happens before the VM is running.

Comment 20 Germano Veit Michel 2019-05-16 01:18:24 UTC
(In reply to Nir Soffer from comment #19)
> (In reply to Germano Veit Michel from comment #18)
> > > System including the patches does not need any tool, volumes will be fixed
> > > automatically when you start a vm, or when you prepare a disk.
> > 
> > What about already running VMs? New snapshtos do not call prepare since 4.2.
> > Any plans to cover this case as well?
> 
> 
> How do you get already running vm on a system with the fixes? You need
> to upgrade vdsm, and upgrading vdsm is blocked if we have vms running
> on a host.
> 
> The only way to have running vm with a new version include the fix is
> to:
> 1. Start the vm after upgrade, preparing the volumes -> metadata fixed
> 2. Migrate vm from old host, preparing the volumes -> metadata fixed

Of course!

> 
> > > A tool for fixing volumes is needed only for system without the fix (e.g.
> > > 4.2.8),
> > > and in this case preparing a volume will not fix anything.
> > > 
> > 
> > Right, but to get the tool one would need to upgrade. 
> 
> No, the tool should be external to vdsm so you can use it on
> a 4.2.x system in the field that you cannot upgrade at this point.
> 
> Once metadata is fixed, new snapshots will be safe. However if a new
> snapshot is extended and move again, we will have broken metadata again.
> 
> > And by upgrading one
> > will get the fixes, so it makes the tool pointless?
> 
> If you don't plan to fix running vms with broken metadata, that would be 
> corrupted by creating a snapshot, and think that the best way is to upgrade,
> then the tool is not very helpful.

Upgrading is always the best way, as it also prevents new volumes from hitting the problem.
I don't see much value in this tool, even less so if it's not part of vdsm.

> 
> > > I think we can start with dump-volume-chain call, getting a json or aql with
> > > all volumes.
> > > 
> > > Then we can find the top volume in every chain and:
> > > 
> > > 1. Check if the image is prepared for a vm or by storage operation
> > >    I don't know a good way to do this, this is something engine manages.
> > > 
> > > 2. If the image is not used, prepare it with Image.prepare
> > > 
> > > 3. Get the backing chain info with all the info about the disk
> > > 
> > >     qemu-img info -U --backing-chain --output json /path/to/volume
> > > 
> > > 4. If the image was not used, tear it down with Image.teardown
> > > 
> > > With this you have all the info to report which volumes metadta is not
> > > synced with qcow2 metadata (qcow2 image) or actual file size (raw image).
> > 
> > In my draft tool I just prepare and teardown everything, expecting in-use
> > images to fail to teardown.
> 
> It can break in many ways if some other code think that it owns the volume
> while you try check and fix it, and when you teardown you can break the other
> code.
> 

Oh, I thought it was safe, as we have some wrong teardowns around, like BZ1644142.

> There is one way to access volumes which are not used by a vm safely. Start
> image transfer with download direction for a volume. This locks the disk and
> prepare the entire chain. Then you can check and fix the chain safely.
> When you finish, you finalize the transfer, releasing the locks.
> 
> If the image is used by a VM, you will not be able to start image transfer
> for the active volume, but you also don't need to prepare or teardown
> anything.
> 
> If telling the users that they should not modify any disk or vm while running
> the tool is good enough, then maybe preparing and tearing down is good
> enough.
> 
> > One thing I am concerned is with VMs already running and affected (wrong
> > SIZE in MD, but no qcow2 with wrong virtual size yet). Will the new fixes
> > correct the SIZE in the MD before the new leaf is created with a wrong size?
> 
> If the system was upgraded the metadta will be fixed before the vm is
> running.
> 
> > Will that patch of invalidate size on volume.prepare do this on the host
> > running the VM before the snapshot is done?
> 
> No, before the vm was running.

IMHO:
- This tool would be too complex to do things safely.
- This tool would not be very useful; the only good way forward is to upgrade, which avoids new volumes being created with the wrong size.

If a customer wants to stay on an older version (risking hitting the problem again) we can do manual checks; it's not hard to script qemu-img info on all open volumes on each host and grab the entire metadata LV, and we do similar things all the time.
We already offered this option in the ticket attached to this case.

So, I think the best plan of action is to wait for you to finish 4.2.z and 4.3, and to instruct customers to upgrade all hosts while migrating VMs around. Once all hosts have been upgraded, all running VMs will have been fixed, and the remaining ones will be fixed on power-up.

Agreed?

Comment 21 Nir Soffer 2019-05-16 13:28:29 UTC
(In reply to Germano Veit Michel from comment #20)
I think you got it right.

Comment 22 Daniel Gur 2019-08-28 13:13:03 UTC
sync2jira

Comment 23 Daniel Gur 2019-08-28 13:17:16 UTC
sync2jira

Comment 24 RHV bug bot 2019-10-22 17:25:53 UTC
WARN: Bug status wasn't changed from MODIFIED to ON_QA due to the following reason:

[Found non-acked flags: '{}', ]

For more info please contact: rhv-devops

Comment 25 RHV bug bot 2019-10-22 17:39:01 UTC
WARN: Bug status wasn't changed from MODIFIED to ON_QA due to the following reason:

[Found non-acked flags: '{}', ]

For more info please contact: rhv-devops

Comment 26 RHV bug bot 2019-10-22 17:46:19 UTC
WARN: Bug status wasn't changed from MODIFIED to ON_QA due to the following reason:

[Found non-acked flags: '{}', ]

For more info please contact: rhv-devops

Comment 27 RHV bug bot 2019-10-22 18:02:07 UTC
WARN: Bug status wasn't changed from MODIFIED to ON_QA due to the following reason:

[Found non-acked flags: '{}', ]

For more info please contact: rhv-devops

Comment 28 RHV bug bot 2019-11-19 11:53:15 UTC
WARN: Bug status wasn't changed from MODIFIED to ON_QA due to the following reason:

[Found non-acked flags: '{}', ]

For more info please contact: rhv-devops

Comment 29 RHV bug bot 2019-11-19 12:03:12 UTC
WARN: Bug status wasn't changed from MODIFIED to ON_QA due to the following reason:

[Found non-acked flags: '{}', ]

For more info please contact: rhv-devops

Comment 30 RHEL Program Management 2019-11-19 12:33:09 UTC
The documentation text flag should only be set after 'doc text' field is provided. Please provide the documentation text and set the flag to '?' again.

Comment 31 RHV bug bot 2019-12-13 13:16:37 UTC
WARN: Bug status (ON_QA) wasn't changed but the following should be fixed:

[Found non-acked flags: '{}', ]

For more info please contact: rhv-devops

Comment 32 RHV bug bot 2019-12-20 17:46:05 UTC
WARN: Bug status (ON_QA) wasn't changed but the following should be fixed:

[Found non-acked flags: '{}', ]

For more info please contact: rhv-devops

Comment 33 RHV bug bot 2020-01-08 14:50:23 UTC
WARN: Bug status (ON_QA) wasn't changed but the following should be fixed:

[Found non-acked flags: '{}', ]

For more info please contact: rhv-devops

Comment 34 RHV bug bot 2020-01-08 15:18:37 UTC
WARN: Bug status (ON_QA) wasn't changed but the following should be fixed:

[Found non-acked flags: '{}', ]

For more info please contact: rhv-devops

Comment 35 RHV bug bot 2020-01-24 19:52:05 UTC
WARN: Bug status (ON_QA) wasn't changed but the following should be fixed:

[Found non-acked flags: '{}', ]

For more info please contact: rhv-devops

Comment 37 Shir Fishbain 2020-04-06 20:41:34 UTC
Verified

After the move, each volume keeps its correct capacity; the extended leaf is no longer overwritten with the parent's size.
The metadata disk size didn't change as a result of the move. It also succeeded when the VM was up.

vdsm-4.40.11-1.el8ev.x86_64
ovirt-engine-4.4.0-0.31.master.el8ev.noarch

Steps to reproduce:
1. Create a VM with a 50 GiB disk
disk_id = a17d2cc7-2faf-4847-b076-2b01c13d1024
sd id = iscsi_0: 042adaf2-c4aa-485b-a0b2-38cd2e64e912
vm_id = 64c715e9-0b03-4882-a113-11b3367f61da
base volume/image ID= c74de4d1-0bd0-492e-af1c-042668ac43f1 

Checking the volume:
[root@storage-ge8-vdsm2 a17d2cc7-2faf-4847-b076-2b01c13d1024]# lvchange -ay /dev/042adaf2-c4aa-485b-a0b2-38cd2e64e912/c74de4d1-0bd0-492e-af1c-042668ac43f1
[root@storage-ge8-vdsm2 a17d2cc7-2faf-4847-b076-2b01c13d1024]# qemu-img info /dev/042adaf2-c4aa-485b-a0b2-38cd2e64e912/c74de4d1-0bd0-492e-af1c-042668ac43f1
image: /dev/042adaf2-c4aa-485b-a0b2-38cd2e64e912/c74de4d1-0bd0-492e-af1c-042668ac43f1
file format: qcow2
virtual size: 50 GiB (53687091200 bytes)
disk size: 0 B
cluster_size: 65536
Format specific information:
    compat: 1.1
    lazy refcounts: false
    refcount bits: 16
    corrupt: false


2. Create a snapshot; one more image is added: eed31393-a765-42a4-8435-06c69625707c

Checking the volumes:
[root@storage-ge8-vdsm2 a17d2cc7-2faf-4847-b076-2b01c13d1024]# lvchange -ay /dev/042adaf2-c4aa-485b-a0b2-38cd2e64e912/c74de4d1-0bd0-492e-af1c-042668ac43f1
[root@storage-ge8-vdsm2 a17d2cc7-2faf-4847-b076-2b01c13d1024]# qemu-img info /dev/042adaf2-c4aa-485b-a0b2-38cd2e64e912/c74de4d1-0bd0-492e-af1c-042668ac43f1
image: /dev/042adaf2-c4aa-485b-a0b2-38cd2e64e912/c74de4d1-0bd0-492e-af1c-042668ac43f1
file format: qcow2
virtual size: 50 GiB (53687091200 bytes)
disk size: 0 B
cluster_size: 65536
Format specific information:
    compat: 1.1
    lazy refcounts: false
    refcount bits: 16
    corrupt: false

[root@storage-ge8-vdsm2 a17d2cc7-2faf-4847-b076-2b01c13d1024]# lvchange -ay /dev/042adaf2-c4aa-485b-a0b2-38cd2e64e912/eed31393-a765-42a4-8435-06c69625707c
[root@storage-ge8-vdsm2 a17d2cc7-2faf-4847-b076-2b01c13d1024]# qemu-img info /dev/042adaf2-c4aa-485b-a0b2-38cd2e64e912/eed31393-a765-42a4-8435-06c69625707c
image: /dev/042adaf2-c4aa-485b-a0b2-38cd2e64e912/eed31393-a765-42a4-8435-06c69625707c
file format: qcow2
virtual size: 50 GiB (53687091200 bytes)
disk size: 0 B
cluster_size: 65536
backing file: c74de4d1-0bd0-492e-af1c-042668ac43f1 (actual path: /dev/042adaf2-c4aa-485b-a0b2-38cd2e64e912/c74de4d1-0bd0-492e-af1c-042668ac43f1)
backing file format: qcow2
Format specific information:
    compat: 1.1
    lazy refcounts: false
    refcount bits: 16
    corrupt: false


3. Extend the disk to 100 GiB.
At this point the base volume is 50 GiB and the top volume is 100 GiB.

Checking the volumes:
[root@storage-ge8-vdsm2 a17d2cc7-2faf-4847-b076-2b01c13d1024]# lvchange -ay /dev/042adaf2-c4aa-485b-a0b2-38cd2e64e912/c74de4d1-0bd0-492e-af1c-042668ac43f1
[root@storage-ge8-vdsm2 a17d2cc7-2faf-4847-b076-2b01c13d1024]# qemu-img info /dev/042adaf2-c4aa-485b-a0b2-38cd2e64e912/c74de4d1-0bd0-492e-af1c-042668ac43f1
image: /dev/042adaf2-c4aa-485b-a0b2-38cd2e64e912/c74de4d1-0bd0-492e-af1c-042668ac43f1
file format: qcow2
virtual size: 50 GiB (53687091200 bytes)
disk size: 0 B
cluster_size: 65536
Format specific information:
    compat: 1.1
    lazy refcounts: false
    refcount bits: 16
    corrupt: false
[root@storage-ge8-vdsm2 a17d2cc7-2faf-4847-b076-2b01c13d1024]# lvchange -ay /dev/042adaf2-c4aa-485b-a0b2-38cd2e64e912/eed31393-a765-42a4-8435-06c69625707c
[root@storage-ge8-vdsm2 a17d2cc7-2faf-4847-b076-2b01c13d1024]# qemu-img info /dev/042adaf2-c4aa-485b-a0b2-38cd2e64e912/eed31393-a765-42a4-8435-06c69625707c
image: /dev/042adaf2-c4aa-485b-a0b2-38cd2e64e912/eed31393-a765-42a4-8435-06c69625707c
file format: qcow2
virtual size: 100 GiB (107374182400 bytes)
disk size: 0 B
cluster_size: 65536
backing file: c74de4d1-0bd0-492e-af1c-042668ac43f1 (actual path: /dev/042adaf2-c4aa-485b-a0b2-38cd2e64e912/c74de4d1-0bd0-492e-af1c-042668ac43f1)
backing file format: qcow2
Format specific information:
    compat: 1.1
    lazy refcounts: false
    refcount bits: 16
    corrupt: false


4. Move the disk to another SD (iscsi_2 = c90fcfaa-4c0f-4fd2-931e-10131bba3e56).
Checking the volumes:
[root@storage-ge8-vdsm2 a17d2cc7-2faf-4847-b076-2b01c13d1024]# lvchange -ay /dev/c90fcfaa-4c0f-4fd2-931e-10131bba3e56/c74de4d1-0bd0-492e-af1c-042668ac43f1
[root@storage-ge8-vdsm2 a17d2cc7-2faf-4847-b076-2b01c13d1024]# qemu-img info /dev/c90fcfaa-4c0f-4fd2-931e-10131bba3e56/c74de4d1-0bd0-492e-af1c-042668ac43f1
image: /dev/c90fcfaa-4c0f-4fd2-931e-10131bba3e56/c74de4d1-0bd0-492e-af1c-042668ac43f1
file format: qcow2
virtual size: 50 GiB (53687091200 bytes)
disk size: 0 B
cluster_size: 65536
Format specific information:
    compat: 1.1
    lazy refcounts: false
    refcount bits: 16
    corrupt: false

[root@storage-ge8-vdsm2 a17d2cc7-2faf-4847-b076-2b01c13d1024]# lvchange -ay /dev/c90fcfaa-4c0f-4fd2-931e-10131bba3e56/eed31393-a765-42a4-8435-06c69625707c
[root@storage-ge8-vdsm2 a17d2cc7-2faf-4847-b076-2b01c13d1024]# qemu-img info /dev/c90fcfaa-4c0f-4fd2-931e-10131bba3e56/eed31393-a765-42a4-8435-06c69625707c
image: /dev/c90fcfaa-4c0f-4fd2-931e-10131bba3e56/eed31393-a765-42a4-8435-06c69625707c
file format: qcow2
virtual size: 100 GiB (107374182400 bytes)
disk size: 0 B
cluster_size: 65536
backing file: c74de4d1-0bd0-492e-af1c-042668ac43f1 (actual path: /dev/c90fcfaa-4c0f-4fd2-931e-10131bba3e56/c74de4d1-0bd0-492e-af1c-042668ac43f1)
backing file format: qcow2
Format specific information:
    compat: 1.1
    lazy refcounts: false
    refcount bits: 16
    corrupt: false

Comment 45 errata-xmlrpc 2020-08-04 13:26:28 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (RHV RHEL Host (ovirt-host) 4.4), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2020:3246