Bug 1707932 - [downstream clone - 4.3.4] Moving disk results in wrong SIZE/CAP key in the volume metadata
Summary: [downstream clone - 4.3.4] Moving disk results in wrong SIZE/CAP key in the v...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: vdsm
Version: 4.2.8
Hardware: x86_64
OS: Linux
Priority: urgent
Severity: urgent
Target Milestone: ovirt-4.3.4
Target Release: 4.3.1
Assignee: Vojtech Juranek
QA Contact: Shir Fishbain
URL:
Whiteboard:
Depends On: 1700623
Blocks: 1707934
Reported: 2019-05-08 17:32 UTC by RHV bug bot
Modified: 2020-09-14 06:15 UTC (History)

Fixed In Version: vdsm-4.30.15
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1700623
: 1707934 (view as bug list)
Environment:
Last Closed: 2019-06-20 14:48:41 UTC
oVirt Team: Storage
Target Upstream Version:
Embargoed:
lsvaty: testing_plan_complete-



Links
System ID Private Priority Status Summary Last Updated
Red Hat Knowledge Base (Solution) 4065541 0 Troubleshoot None After creating snapshot the VM paused with IO Error, the disk changed size and seems corrupted. 2019-05-08 17:33:19 UTC
Red Hat Knowledge Base (Solution) 4308991 0 None None None 2020-09-14 06:15:11 UTC
Red Hat Product Errata RHBA-2019:1567 0 None None None 2019-06-20 14:48:49 UTC
oVirt gerrit 99813 0 'None' MERGED tests: add tests for volume size different from parent size 2020-09-24 07:12:34 UTC
oVirt gerrit 99814 0 'None' MERGED volume: remove size override when creating a volume with parent 2020-09-24 07:12:37 UTC
oVirt gerrit 99815 0 'None' MERGED tests: enable tests for volume metadata size 2020-09-24 07:12:33 UTC
oVirt gerrit 99832 0 'None' MERGED volume: Repair volume capacity when preparing a volume 2020-09-24 07:12:33 UTC
oVirt gerrit 99837 0 'None' MERGED hsm: Repair volume capacity when preparing an image 2020-09-24 07:12:33 UTC
oVirt gerrit 99890 0 'None' MERGED volume: Validate volume size when creating snapshot 2020-09-24 07:12:33 UTC
oVirt gerrit 99934 0 'None' ABANDONED tests: Test creating volume based on a template 2020-09-24 07:12:33 UTC
oVirt gerrit 99983 0 'None' MERGED image: Use initialSize when creating volume 2020-09-24 07:12:32 UTC
oVirt gerrit 99984 0 'None' MERGED tests: add tests for updateInvalidatedSize() 2020-09-24 07:12:36 UTC
oVirt gerrit 99985 0 'None' MERGED tests: add test for fixing corrupted volume capacity metadata 2020-09-24 07:12:32 UTC
oVirt gerrit 99986 0 'None' MERGED test: make monkeypatching in merge test more precise 2020-09-24 07:12:32 UTC
oVirt gerrit 99987 0 'None' MERGED tests: Test creating snapshot with invalid size 2020-09-24 07:12:36 UTC

Description RHV bug bot 2019-05-08 17:32:48 UTC
+++ This bug is a downstream clone. The original bug is: +++
+++   bug 1700623 +++
======================================================================

Description of problem:

Moving the disk from storage A to B results in wrong SIZE key in the volume metadata on B if the volume has been previously extended.

Before moving
=============

# lvs -o +tags| grep b9fd9e73-32d3-473a-8cb5-d113602f76e1 | awk -F ' ' '{print $1,$2,$4,$5}'
359c2ea7-0a73-4296-8109-b799d9bfbd08 51e44de8-2fc0-4e99-8860-6820ff023108 1.00g IU_b9fd9e73-32d3-473a-8cb5-d113602f76e1,MD_23,PU_5f478dfb-78bb-4217-ad63-6927dab7cc90
5f478dfb-78bb-4217-ad63-6927dab7cc90 51e44de8-2fc0-4e99-8860-6820ff023108 5.00g IU_b9fd9e73-32d3-473a-8cb5-d113602f76e1,MD_22,PU_00000000-0000-0000-0000-000000000000

# dd status=none if=/dev/51e44de8-2fc0-4e99-8860-6820ff023108/metadata count=1 bs=512 skip=22 | grep -a SIZE
SIZE=10485760

# dd status=none if=/dev/51e44de8-2fc0-4e99-8860-6820ff023108/metadata count=1 bs=512 skip=23 | grep -a SIZE
SIZE=20971520
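
The lookup above (the MD_<n> LV tag selects the 512-byte slot that dd reads with skip=<n>) can be sketched in Python. This is only an illustration: the function names are hypothetical and are not vdsm APIs, and the slot layout is assumed from the dd invocations in this report (bs=512, one slot per volume).

```python
SLOT_SIZE = 512  # bytes per metadata slot, matching bs=512 in the dd calls


def parse_lv_tags(tags):
    """Split an LV tag string into image UUID, metadata slot and parent UUID."""
    parsed = {}
    for tag in tags.split(","):
        # UUIDs contain hyphens only, so the first "_" separates prefix and value.
        prefix, _, value = tag.partition("_")
        parsed[prefix] = value
    return {
        "image": parsed["IU"],
        "slot": int(parsed["MD"]),
        "parent": parsed["PU"],
    }


def parse_slot(raw):
    """Parse KEY=VALUE lines from one raw metadata slot (bytes)."""
    meta = {}
    text = raw.rstrip(b"\0").decode("utf-8", "replace")
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:  # ignore lines that are not KEY=VALUE pairs
            meta[key.strip()] = value.strip()
    return meta


def read_slot(path, slot):
    """Read and parse metadata slot number `slot` from the metadata LV at `path`."""
    with open(path, "rb") as f:
        f.seek(slot * SLOT_SIZE)
        return parse_slot(f.read(SLOT_SIZE))


# Tag string of the leaf volume on the source SD, copied from the lvs output.
leaf_tags = ("IU_b9fd9e73-32d3-473a-8cb5-d113602f76e1,MD_23,"
             "PU_5f478dfb-78bb-4217-ad63-6927dab7cc90")
print(parse_lv_tags(leaf_tags)["slot"])  # 23
```

On a host, read_slot("/dev/<sd-uuid>/metadata", 23)["SIZE"] would correspond to the second dd command above.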

After moving
============

# lvs -o +tags| grep b9fd9e73-32d3-473a-8cb5-d113602f76e1 | awk -F ' ' '{print $1,$2,$4,$5}'
359c2ea7-0a73-4296-8109-b799d9bfbd08 43c67df7-2293-4756-9aa3-de09d67d7050 1.00g IU_b9fd9e73-32d3-473a-8cb5-d113602f76e1,MD_95,PU_5f478dfb-78bb-4217-ad63-6927dab7cc90
5f478dfb-78bb-4217-ad63-6927dab7cc90 43c67df7-2293-4756-9aa3-de09d67d7050 5.00g IU_b9fd9e73-32d3-473a-8cb5-d113602f76e1,MD_93,PU_00000000-0000-0000-0000-000000000000

# dd status=none if=/dev/43c67df7-2293-4756-9aa3-de09d67d7050/metadata count=1 bs=512 skip=93 | grep -a SIZE
SIZE=10485760

# dd status=none if=/dev/43c67df7-2293-4756-9aa3-de09d67d7050/metadata count=1 bs=512 skip=95 | grep -a SIZE
SIZE=10485760       <----------------------- wrong

The SIZE key in the metadata went from 20971520 on SRC SD to 10485760 (same as parent).
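
To put these values in perspective, here is a small conversion sketch. It assumes the SIZE key counts 512-byte blocks, which is consistent with the 5.00g base LV in the lvs output; the helper name is made up for illustration.

```python
BLOCK_SIZE = 512  # assumed unit of the SIZE key, in bytes


def blocks_to_gib(blocks):
    """Convert a SIZE value (512-byte blocks) to GiB."""
    return blocks * BLOCK_SIZE / 1024**3


print(blocks_to_gib(10485760))  # 5.0  (base volume, and the leaf after the move)
print(blocks_to_gib(20971520))  # 10.0 (leaf on the source SD, after the extend)
```

Under that assumption, the leaf's metadata on the destination SD describes a 5 GiB disk, although the disk had been extended to 10 GiB.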

Taken together with BZ1700189, the severity of this is urgent.

Version-Release number of selected component (if applicable):
vdsm-4.20.47-1.el7ev
rhvm-4.2.8.5-0.1.el7ev.noarch

How reproducible:
100%

Steps to Reproduce:
1. Create VM with 5GB disk
2. Snapshot it
3. Extend disk by 5GB
4. Move this to another SD

Additional info:
* Also happens on LIVE STORAGE MIGRATION
* The entire chain gets the wrong size, not just the leaf.

(Originally by Germano Veit Michel)

Comment 1 RHV bug bot 2019-05-08 17:32:51 UTC
Note: this was block storage to block storage

(Originally by Germano Veit Michel)

Comment 2 RHV bug bot 2019-05-08 17:32:53 UTC
The create volume command on DST SD looks right, not yet sure why the metadata is wrong.

2019-04-17 09:58:20,359+1000 INFO  (jsonrpc/2) [vdsm.api] START createVolume(sdUUID=u'43c67df7-2293-4756-9aa3-de09d67d7050', spUUID=u'da42e5a5-f6f7-49b4-8256-2adf690ddf4c', imgUUID=u'b9fd9e73-32d3-473a-8cb5-d113602f76e1', size=u'10737418240', volFormat=4, preallocate=2, diskType=u'DATA', volUUID=u'359c2ea7-0a73-4296-8109-b799d9bfbd08', desc=None, srcImgUUID=u'b9fd9e73-32d3-473a-8cb5-d113602f76e1', srcVolUUID=u'5f478dfb-78bb-4217-ad63-6927dab7cc90', initialSize=u'976128931') from=::ffff:10.64.24.161,49332, flow_id=23cc02dc-502c-4d33-9271-3f5b6b89a69a, task_id=c2e90abb-fa9c-415d-b9f7-e9d13520971d (api:46)

(Originally by Germano Veit Michel)

Comment 5 RHV bug bot 2019-05-08 17:32:59 UTC
The issue is this line in volume.py:
 
1148                 # Override the size with the size of the parent
1149                 size = volParent.getSize()

When creating a volume with a parent volume, vdsm silently overrides the size
sent by engine.
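
The effect of the override can be shown with a simplified sketch. This is not the vdsm code: both functions are hypothetical stand-ins that only model how the size is chosen, using the SIZE values from the description.

```python
def create_volume_size_with_override(requested_size, parent_size=None):
    """Model of the old behaviour: the parent size silently wins."""
    size = requested_size
    if parent_size is not None:
        # Override the size with the size of the parent
        size = parent_size
    return size


def create_volume_size_without_override(requested_size, parent_size=None):
    """Model of the behaviour once the override is removed.

    parent_size is accepted only to keep the same signature; it is ignored.
    """
    return requested_size


requested = 20971520  # size of the extended leaf on the source SD
parent = 10485760     # size of the parent volume

print(create_volume_size_with_override(requested, parent))     # 10485760
print(create_volume_size_without_override(requested, parent))  # 20971520
```

With the override in place, the copy of an extended leaf ends up with the parent's size, which matches the wrong SIZE value seen on the destination SD.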

The code was added in

commit 8a0236a2fdf4e81f9b73e9279606053797e14753
Author: Federico Simoncelli <fsimonce>
Date:   Tue Apr 17 18:33:51 2012 +0000

    Unify the volume creation code in volume.create
    
    This patch lays out the principles of the create volume flow (unified
    both for block and file storage domains).
    
    Signed-off-by: Federico Simoncelli <fsimonce>
    Change-Id: I0e44da32351a420f0536505985586b24ded81a2a
    Reviewed-on: http://gerrit.ovirt.org/3627
    Reviewed-by: Allon Mureinik <amureini>
    Reviewed-by: Ayal Baron <abaron>

The review does not exist on gerrit, and there is no info explaining why vdsm
needs to silently override the size sent by engine and use the parent size.
Maybe this was needed in the past to work around some engine bug or issue in
another vdsm flow.

So it seems that creating a volume chain with different sizes was always broken.

I think we need to:
- remove this override
- check if removing it breaks some other flow - may break snapshot creation if engine
  sends the wrong size, maybe this code "fixes" such a case.
- verify metadata size when preparing existing volume, and fix inconsistencies between
  qcow2 virtual size and volume size
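
The last item could look roughly like the sketch below. It is a guess at the shape of such a check, not the patch itself: the function names are hypothetical, both sizes are taken in bytes, and the qcow2 virtual size is assumed to be the authoritative value. The helper relies on the "virtual-size" field of qemu-img info JSON output.

```python
import json
import subprocess


def qcow2_virtual_size(path):
    """Return the virtual size in bytes as reported by qemu-img info."""
    out = subprocess.check_output(["qemu-img", "info", "--output=json", path])
    return json.loads(out)["virtual-size"]


def repair_capacity(meta_capacity, virtual_size):
    """Compare metadata capacity with the qcow2 virtual size (both in bytes).

    Returns (capacity, repaired): the value the metadata should hold and
    whether it differs from what was stored.
    """
    if meta_capacity == virtual_size:
        return meta_capacity, False
    # Assumption: the image's own virtual size is trusted over the metadata.
    return virtual_size, True


# Metadata says 5 GiB while the image is 10 GiB, as in this report.
print(repair_capacity(5 * 1024**3, 10 * 1024**3))  # (10737418240, True)
```

A caller preparing a volume would write the returned capacity back to the metadata only when repaired is True.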

(Originally by Nir Soffer)

Comment 8 RHV bug bot 2019-05-08 17:33:05 UTC
We have 2 patches in review:

- https://gerrit.ovirt.org/c/99539/ - this fixes the root cause, creating volumes
  with bad metadata.

- https://gerrit.ovirt.org/c/99541 - this currently fails to prepare a volume with
  bad metadata, so it would prevent corruption of the image when creating a snapshot,
  but it will fail starting a VM or moving a disk with such a volume. I think we can
  fix bad metadata when preparing a volume, since we already do this for the special
  zero metadata size.

Both patches are small and simple, and a backport to 4.2 should be possible. Once this
is fixed upstream we can evaluate a backport to 4.2.

(Originally by Nir Soffer)

Comment 13 Nir Soffer 2019-05-09 16:55:10 UTC
Removing master patches, only 4.3 patches should be attached here.

Comment 14 RHV bug bot 2019-05-16 15:29:12 UTC
INFO: Bug status wasn't changed from MODIFIED to ON_QA due to the following reason:

[Project 'vdsm'/Component 'ovirt-engine' mismatch]

For more info please contact: rhv-devops

Comment 16 Shir Fishbain 2019-05-26 16:05:06 UTC
Verified 

The SIZE key in the metadata is the same as a parent.
The metadata disk size didn't change as a result of the move action.

ovirt-engine-4.3.4.1-0.1.el7.noarch
vdsm-4.30.16-3.el7ev.x86_64

Comment 18 errata-xmlrpc 2019-06-20 14:48:41 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:1567

Comment 19 Daniel Gur 2019-08-28 13:12:58 UTC
sync2jira

Comment 20 Daniel Gur 2019-08-28 13:17:10 UTC
sync2jira

