Bug 1877790
| Field | Value |
|---|---|
| Summary | lsm causes disk to change from RAW to QCOW2, but database is not updated |
| Product | [oVirt] ovirt-engine |
| Component | BLL.Storage |
| Version | 4.4.1 |
| Hardware | Unspecified |
| OS | Unspecified |
| Status | CLOSED CURRENTRELEASE |
| Severity | urgent |
| Priority | unspecified |
| Reporter | Jean-Louis Dupond <jean-louis> |
| Assignee | Benny Zlotnik <bzlotnik> |
| QA Contact | Evelina Shames <eshames> |
| CC | aefrat, bugs, bzlotnik, gveitmic, jean-louis, michal.skrivanek, tnisan |
| Target Milestone | ovirt-4.4.2 |
| Target Release | 4.4.2.6 |
| Fixed In Version | ovirt-engine-4.4.2.6 |
| Flags | pm-rhel: ovirt-4.4+; michal.skrivanek: blocker?; tnisan: devel_ack+ |
| Doc Type | No Doc Update |
| oVirt Team | Storage |
| Type | Bug |
| Clones | 1878341 (view as bug list) |
| Bug Blocks | 1878341 |
| Last Closed | 2020-09-18 07:13:09 UTC |
I can easily reproduce the same on:
* vdsm-4.40.22-1.el8ev.x86_64
* ovirt-engine-4.4.1.10-0.1.el8ev.noarch
* does not happen on copy or cold move. Only on live move.
* only the DB volume_format seems wrong. Storage metadata is fine.
1. Create a thin disk on NFS:
# su vdsm -s /bin/sh -c "qemu-img info ea8bcf2e-d2c1-410b-8907-38e36e765b19"
image: ea8bcf2e-d2c1-410b-8907-38e36e765b19
file format: raw
virtual size: 1 GiB (1073741824 bytes)
disk size: 4 KiB
# cat ea8bcf2e-d2c1-410b-8907-38e36e765b19.meta
CAP=1073741824
CTIME=1599886753
DESCRIPTION={"DiskAlias":"TestDisk","DiskDescription":""}
DISKTYPE=DATA
DOMAIN=d2def521-aa89-4738-aaca-5b618b97e925
FORMAT=RAW
GEN=0
IMAGE=973c50fc-7e21-4a48-a130-40acb6fa5744
LEGALITY=LEGAL
PUUID=00000000-0000-0000-0000-000000000000
TYPE=SPARSE
VOLTYPE=LEAF
EOF
engine=# select image_guid,image_group_id,size,volume_format,volume_type from images where image_group_id = '973c50fc-7e21-4a48-a130-40acb6fa5744';
image_guid | image_group_id | size | volume_format | volume_type
--------------------------------------+--------------------------------------+------------+---------------+-------------
ea8bcf2e-d2c1-410b-8907-38e36e765b19 | 973c50fc-7e21-4a48-a130-40acb6fa5744 | 1073741824 | 5 | 2
(1 row)
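For reference, the numeric values in the `images` table can be decoded as follows. This is a minimal sketch; the mappings 5 = RAW, 4 = COW (qcow2 on disk) and 2 = Sparse are inferred from this report itself (the RAW disk above is stored with volume_format 5, and the fix later updates it to 4 to get qcow2), not taken from the oVirt source.

```python
# Decode the numeric enum values oVirt stores in the images table.
# Mappings inferred from this report: 5 = RAW, 4 = COW (qcow2),
# volume_type 2 = SPARSE (pairs with TYPE=SPARSE in the metadata).
VOLUME_FORMAT = {4: "COW", 5: "RAW"}
VOLUME_TYPE = {1: "PREALLOCATED", 2: "SPARSE"}

def decode(volume_format, volume_type):
    """Return human-readable (format, type) for a DB row."""
    return (VOLUME_FORMAT.get(volume_format, "UNKNOWN"),
            VOLUME_TYPE.get(volume_type, "UNKNOWN"))

print(decode(5, 2))  # the row above: ('RAW', 'SPARSE')
```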
2. Live move the disk to a block storage domain
3. After moving:
# qemu-img info /dev/729c0555-4148-4fcf-b5c9-4f07ec9f0307/ea8bcf2e-d2c1-410b-8907-38e36e765b19
image: /dev/729c0555-4148-4fcf-b5c9-4f07ec9f0307/ea8bcf2e-d2c1-410b-8907-38e36e765b19
file format: qcow2
virtual size: 1 GiB (1073741824 bytes)
disk size: 0 B
cluster_size: 65536
Format specific information:
compat: 1.1
lazy refcounts: false
refcount bits: 16
corrupt: false
# dd if=/dev/729c0555-4148-4fcf-b5c9-4f07ec9f0307/metadata bs=8k count=1 skip=129
CAP=1073741824
CTIME=1599887113
DESCRIPTION=None
DISKTYPE=DATA
DOMAIN=729c0555-4148-4fcf-b5c9-4f07ec9f0307
FORMAT=COW
GEN=1
IMAGE=973c50fc-7e21-4a48-a130-40acb6fa5744
LEGALITY=LEGAL
PUUID=00000000-0000-0000-0000-000000000000
TYPE=SPARSE
VOLTYPE=INTERNAL
EOF
engine=# select image_guid,image_group_id,size,volume_format,volume_type from images where image_group_id = '973c50fc-7e21-4a48-a130-40acb6fa5744';
image_guid | image_group_id | size | volume_format | volume_type
--------------------------------------+--------------------------------------+------------+---------------+-------------
ea8bcf2e-d2c1-410b-8907-38e36e765b19 | 973c50fc-7e21-4a48-a130-40acb6fa5744 | 1073741824 | 5 | 2
(1 row)
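The stale row can be detected mechanically by comparing the FORMAT key from the storage-side volume metadata (the key=value block above) against the DB value. A hedged sketch, assuming the same inferred mapping (5 = RAW, 4 = COW); `parse_meta` and `db_matches_storage` are hypothetical helpers for illustration, not oVirt APIs:

```python
# Compare the FORMAT key in the vdsm volume metadata with the
# volume_format stored in the engine DB. The 4/5 enum values are
# inferred from this report, not taken from the oVirt source.
DB_FORMAT = {4: "COW", 5: "RAW"}

def parse_meta(text):
    """Parse the key=value metadata block, stopping at the EOF marker."""
    meta = {}
    for line in text.splitlines():
        if line.strip() == "EOF":
            break
        if "=" in line:
            key, _, value = line.partition("=")
            meta[key] = value
    return meta

def db_matches_storage(db_volume_format, meta_text):
    """True if the DB enum agrees with the storage-side FORMAT key."""
    return DB_FORMAT.get(db_volume_format) == parse_meta(meta_text).get("FORMAT")

meta = "FORMAT=COW\nGEN=1\nEOF\n"
print(db_matches_storage(5, meta))  # the stale row above: False
```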
4. Shutdown the VM
5. Start again
6. The engine generates wrong XML for the disk, as volume_format is wrong in the database:
2020-09-12 15:14:30,134+10 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.CreateBrokerVDSCommand] (EE-ManagedThreadFactory-engine-Thread-880) [c1ab3844-5b6b-431b-9654-4ba771434ace] VM <?xml version="1.0" encoding="UTF-8"?><domain type="kvm" xmlns:ovirt-tune="http://ovirt.org/vm/tune/1.0" xmlns:ovirt-vm="http://ovirt.org/vm/1.0" xmlns:qemu="http://libvirt.org/schemas/domain/qemu/1.0">
...
<disk snapshot="no" type="file" device="disk">
<target dev="sda" bus="scsi"/>
<source file="/rhev/data-center/2ce9d738-dd1f-11ea-bb9a-5254000000ff/d2def521-aa89-4738-aaca-5b618b97e925/images/5359b6bc-93c6-42b1-a779-f6037d08ed47/f01f1989-70ff-4e1f-a2c2-9bc5ddeb800d">
<seclabel model="dac" type="none" relabel="no"/>
</source>
<driver name="qemu" io="threads" type="raw" error_policy="stop" cache="none"/> <------- RAW
<alias name="ua-5359b6bc-93c6-42b1-a779-f6037d08ed47"/>
<address bus="0" controller="0" unit="0" type="drive" target="0"/>
<boot order="1"/>
<serial>5359b6bc-93c6-42b1-a779-f6037d08ed47</serial>
</disk>
7. Since only the DB is wrong, the fix is relatively simple if one knows the image_guid:
UPDATE images SET volume_format = '4' WHERE image_guid = 'ea8bcf2e-d2c1-410b-8907-38e36e765b19';
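Since the authoritative format is whatever qemu-img reports on the destination volume, the corrective statement can be derived from it. A sketch under the same inferred enum mapping (4 = COW/qcow2, 5 = RAW); `fix_statement` is a hypothetical helper, and the resulting SQL would still have to be run against the engine DB:

```python
# Build the corrective UPDATE from the format qemu-img reports on the
# destination volume. The image_guid below is the one from this report;
# the 4/5 enum values are inferred from the report itself.
FORMAT_TO_DB = {"qcow2": 4, "raw": 5}

def fix_statement(image_guid, qemu_img_format):
    """Return the SQL that realigns the DB with the on-disk format."""
    fmt = FORMAT_TO_DB[qemu_img_format]
    return ("UPDATE images SET volume_format = '%d' "
            "WHERE image_guid = '%s';" % (fmt, image_guid))

print(fix_statement("ea8bcf2e-d2c1-410b-8907-38e36e765b19", "qcow2"))
```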
(In reply to Jean-Louis Dupond from comment #0)

> So it seems like some entry didn't change in the database?
> Also, how do we fix the current VM in this state?

Thanks for reporting this, Jean-Louis. In my reproducer only the database is incorrect indeed; the storage metadata is fine.

Please try this in ovirt-engine, then start the VM again:

$ /usr/share/ovirt-engine/dbscripts/engine-psql.sh -c "UPDATE images SET volume_format = '4' WHERE image_guid = '06b0a1ce-1e89-4c12-ab37-9b30fbb4f8e1'"

You should see type='qcow2' in the XML after the change above. Does it work for you as well?

<disk type='block' device='disk' snapshot='no'>
<driver name='qemu' type='qcow2' cache='none' error_policy='stop' io='native'/>
<source dev='/rhev/data-center/mnt/blockSD/6e99da85-8414-4ec5-92c3-b6cf741fc125/images/12f8ecc3-f1b4-42ac-814c-af422aa49512/06b0a1ce-1e89-4c12-ab37-9b30fbb4f8e1' index='3'>
<seclabel model='dac' relabel='no'/>
</source>
The db update fixed it indeed!
(In reply to Jean-Louis Dupond from comment #7)

> The db update fixed it indeed!

Thank you for confirming!

Verified with the following steps:
1. Create a VM with a RAW sparse disk on a file domain
2. Start the VM
3. Live migrate the disk to a block domain
4. Shut down the VM
5. Start the VM

Before the fix: the operation fails on boot, as the volume type sent to libvirt is wrong.
After the fix: the operation succeeds.

Version: engine-4.4.2.6-0.2

This bugzilla is included in the oVirt 4.4.2 release, published on September 17th 2020. Since the problem described in this bug report should be resolved in oVirt 4.4.2, it has been closed with a resolution of CURRENT RELEASE. If the solution does not work for you, please open a new bug report.
Description of problem:

We did a live storage migration of a VM from NFS to iSCSI. Everything was fine until we restarted the VM: one disk could not be mounted. After some debugging, I found out that from inside the VM the disk was visible as QCOW!

I was able to reproduce this quite easily.

Create a raw disk on a random VM on NFS storage:

2020-09-10 14:43:22,030+02 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.CreateImageVDSCommand] (default task-55) [a09ff0e6-ea0a-48c9-af5c-5223c42d89d9] START, CreateImageVDSCommand( CreateImageVDSCommandParameters:{storagePoolId='d497efe5-2344-4d58-8985-7b053d3c35a3', ignoreFailoverLimit='false', storageDomainId='500c30e6-efe7-4dc8-b42d-7252dd812769', imageGroupId='12f8ecc3-f1b4-42ac-814c-af422aa49512', imageSizeInBytes='53687091200', volumeFormat='RAW', newImageId='06b0a1ce-1e89-4c12-ab37-9b30fbb4f8e1', imageType='Sparse', newImageDescription='{"DiskAlias":"bugtest","DiskDescription":""}', imageInitialSizeInBytes='0'}), log id: 7546bfda

Now initiate a live storage migration of that disk to an iSCSI storage domain. You'll get the following warning:

"Block storage domain does not support disk format raw with volume type sparse. The following disks format will become qcow2: bugtest"

The volume is created as QCOW on the destination:

2020-09-10 14:46:02,091+02 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.CreateVolumeVDSCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-35) [753fd06d-b91e-4e31-b737-c40a83440f33] START, CreateVolumeVDSCommand( CreateVolumeVDSCommandParameters:{storagePoolId='d497efe5-2344-4d58-8985-7b053d3c35a3', ignoreFailoverLimit='false', storageDomainId='6e99da85-8414-4ec5-92c3-b6cf741fc125', imageGroupId='12f8ecc3-f1b4-42ac-814c-af422aa49512', imageSizeInBytes='53687091200', volumeFormat='COW', newImageId='06b0a1ce-1e89-4c12-ab37-9b30fbb4f8e1', imageType='Sparse', newImageDescription='null', imageInitialSizeInBytes='53695545344', imageId='00000000-0000-0000-0000-000000000000', sourceImageGroupId='00000000-0000-0000-0000-000000000000'}), log id: 4c9f7790

When migration is done, dumpxml gives the following:

<disk type='block' device='disk' snapshot='no'>
  <driver name='qemu' type='qcow2' cache='none' error_policy='stop' io='threads'/>
  <source dev='/rhev/data-center/mnt/blockSD/6e99da85-8414-4ec5-92c3-b6cf741fc125/images/12f8ecc3-f1b4-42ac-814c-af422aa49512/06b0a1ce-1e89-4c12-ab37-9b30fbb4f8e1' index='9'>
    <seclabel model='dac' relabel='no'/>
  </source>
  <backingStore/>
  <target dev='sdd' bus='scsi'/>
  <serial>12f8ecc3-f1b4-42ac-814c-af422aa49512</serial>
  <alias name='ua-12f8ecc3-f1b4-42ac-814c-af422aa49512'/>
  <address type='drive' controller='0' bus='0' target='0' unit='3'/>
</disk>

Which is correct!

But now shut down the VM and start it again:

<disk type='block' device='disk' snapshot='no'>
  <driver name='qemu' type='raw' cache='none' error_policy='stop' io='native'/>
  <source dev='/rhev/data-center/mnt/blockSD/6e99da85-8414-4ec5-92c3-b6cf741fc125/images/12f8ecc3-f1b4-42ac-814c-af422aa49512/06b0a1ce-1e89-4c12-ab37-9b30fbb4f8e1' index='3'>
    <seclabel model='dac' relabel='no'/>
  </source>
  <backingStore/>
  <target dev='sda' bus='scsi'/>
  <serial>12f8ecc3-f1b4-42ac-814c-af422aa49512</serial>
  <alias name='ua-12f8ecc3-f1b4-42ac-814c-af422aa49512'/>
  <address type='drive' controller='0' bus='0' target='0' unit='3'/>
</disk>

And in the VM:

# file -s /dev/sdc
/dev/sdc: QEMU QCOW Image (v3), 53687091200 bytes

So it seems like some entry didn't change in the database? Also, how do we fix the current VM in this state?