Description of problem:
Memory metadata volumes are created unaligned.

The details are here:
https://bugzilla.redhat.com/show_bug.cgi?id=1649788#c29

Nir posted a fix here:
https://gerrit.ovirt.org/q/topic:pad-vm-conf

In Nir's words:
"
This image is a memory metadata image (DISKTYPE=MEMM), holding metadata about a snapshot.

$ cat 7fec9ccd-c649-4aff-9098-5953cb80f284.meta
DOMAIN=8b27f44c-be74-405e-a099-39cdef2f83bc
CTIME=1550493132
FORMAT=RAW
DISKTYPE=MEMM
LEGALITY=LEGAL
SIZE=20
VOLTYPE=LEAF
DESCRIPTION={"DiskAlias":"vm_TestCase5134_1814254004_snapshot_metadata","DiskDescription":"Memory snapshot disk for snapshot 'snap_TestCase5134_1814274231' of VM 'vm_TestCase5134_1814254004' (VM ID: '513ac21a-9b12-43aa-a"}
IMAGE=ae6a82d9-f63e-4d8a-8c78-31483713fd0e
PUUID=00000000-0000-0000-0000-000000000000
MTIME=0
POOL_UUID=
TYPE=PREALLOCATED
GEN=0
EOF

Reproducing on Fedora 28 with virt-preview:

$ qemu-img convert -f raw -O raw -p -t none -T none 7fec9ccd-c649-4aff-9098-5953cb80f284 7fec9ccd-c649-4aff-9098-5953cb80f284-copy -o preallocation=falloc
qemu-img: /builddir/build/BUILD/qemu-3.1.0/block/io.c:2158: bdrv_co_block_status: Assertion `*pnum && QEMU_IS_ALIGNED(*pnum, align) && align > offset - aligned_offset' failed.
Aborted (core dumped)

Tested with:

$ qemu-img --version
qemu-img version 3.1.0 (qemu-3.1.0-4.fc28)
Copyright (c) 2003-2018 Fabrice Bellard and the QEMU Project developers

Reproducing on CentOS 7.6:

$ qemu-img convert -f raw -O raw -p -t none -T none 7fec9ccd-c649-4aff-9098-5953cb80f284 7fec9ccd-c649-4aff-9098-5953cb80f284-copy -o preallocation=falloc
qemu-img: block/io.c:2134: bdrv_co_block_status: Assertion `*pnum && (((*pnum) % (align)) == 0) && align > offset - aligned_offset' failed.
Aborted (core dumped)

$ qemu-img --version
qemu-img version 2.12.0 (qemu-kvm-ev-2.12.0-18.el7_6.3.1)
Copyright (c) 2003-2017 Fabrice Bellard and the QEMU Project developers

Regardless of the qemu-img issue, I think this is a bug in RHV - this file should be padded to an aligned size. I'm checking why it was not aligned.

But even if we fix this in RHV, this issue may break existing RHV versions (4.2) and images created by such versions.
"

Version-Release number of selected component (if applicable):
ovirt-engine-4.3.0.4-0.1.el7.noarch
vdsm-4.30.8-2.el7ev.x86_64
qemu-img-rhev-2.12.0-21.el7.x86_64
libvirt-4.5.0-10.el7_6.4.x86_64
qemu-guest-agent-2.12.0-2.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Create a VM with a disk on a FILE storage domain (NFS/GLUSTER) and run it.
2. Create a snapshot with memory state from the running VM.
3. After its creation, check that the memory metadata volumes are not aligned (meaning the volume sizes are not rounded to 4K).

Actual results:
Memory metadata volumes are created unaligned.

Expected results:
Memory metadata volumes should be created aligned.

Additional info:
Created attachment 1536028 [details] rpms_engine_vdsm
Updating severity to urgent since this breaks basic flows that try to copy vm memory metadata volumes, and it breaks automation, hiding possible regressions in the disabled tests.
Avihay, can you test the attached patches? If you can reproduce this manually, this should be very easy.
(In reply to Nir Soffer from comment #3)
> Avihay, can you test the attached patches?
>
> If you can reproduce this manually, this should be very easy.

I'll check whether this also reproduces manually.

About testing this with the patches: I see both fixes are in lib/vdsm/virt/vm.py.
I can apply them directly on the VM host by changing lib/vdsm/virt/vm.py, right?
Is a VDSM restart required?
Manual reproduction was achieved with the scenario described in the original description:

[root@storage-ge4-vdsm2 ~]# cat /rhev/data-center/0666bee6-782b-4775-8683-93209de805db/8b27f44c-be74-405e-a099-39cdef2f83bc/images/c6fa5bf1-ec6a-4e87-95ce-d714167fee0e/c53b2ca0-086f-4b18-bb51-3097e110e7a7.meta
DOMAIN=8b27f44c-be74-405e-a099-39cdef2f83bc
CTIME=1550566515
FORMAT=RAW
DISKTYPE=MEMM
LEGALITY=LEGAL
SIZE=20
VOLTYPE=LEAF
DESCRIPTION={"DiskAlias":"vm1_snapshot_metadata","DiskDescription":"Memory snapshot disk for snapshot 's1' of VM 'vm1' (VM ID: '70c4e34c-8171-49e1-a195-3a0372320177')"}
IMAGE=c6fa5bf1-ec6a-4e87-95ce-d714167fee0e
PUUID=00000000-0000-0000-0000-000000000000
MTIME=0
POOL_UUID=
TYPE=PREALLOCATED
GEN=0
EOF

[root@storage-ge4-vdsm2 ~]# qemu-img convert -f raw -O raw -p -t none -T none c53b2ca0-086f-4b18-bb51-3097e110e7a7.meta 7fec9ccd-c649-4aff-9098-5953cb80f284-copy -o preallocation=falloc
qemu-img: block/io.c:2134: bdrv_co_block_status: Assertion `*pnum && (((*pnum) % (align)) == 0) && align > offset - aligned_offset' failed.
Aborted
I changed lib/vdsm/virt/vm.py directly on the host with the changes from the fix patches and restarted VDSM. I will try the same scenario again and report back.
I created another memory snapshot on the same VM and checked the metadata memory disk. The issue is still there: the metadata file is still not 4K aligned, and qemu-img convert fails after the fix.

Please connect to host "storage-ge4-vdsm2.scl.lab.tlv.redhat.com" and check it out for yourself to see if I missed something.

Changed code in host file /usr/lib/python2.7/site-packages/vdsm/virt/vm.py, starting at line 4440:
"
if memoryParams:
    # Save the needed vm configuration
    # TODO: this, as other places that use pickle.dump
    # directly to files, should be done with outOfProcess
    vmConfVol = memoryParams['dstparams']
    vmConf = _vmConfForMemorySnapshot()
    vmConfVolPath = self.cif.prepareVolumePath(vmConfVol)
    try:
        with open(vmConfVolPath, "rb+") as f:
            data = pickle.dumps(vmConf)
            # Ensure that the volume is aligned; qemu-img may segfault
            # when converting unaligned images.
            # https://bugzilla.redhat.com/1649788
            aligned_length = utils.round(len(data), 4096)
            data.ljust(aligned_length)
            f.write(data)
            f.flush()
            os.fsync(f.fileno())
    finally:
"

After Fix:
[root@storage-ge4-vdsm2 ~]# ls -ltr /rhev/data-center/0666bee6-782b-4775-8683-93209de805db/8b27f44c-be74-405e-a099-39cdef2f83bc/images/65dfdc77-a692-488a-8f6a-7cfc9beeceda/
total 1040
-rw-r--r--. 1 vdsm kvm 424 Feb 19 11:18 94325987-b074-4368-8c8c-b5b13a7b67d8.meta
-rw-rw----. 1 vdsm kvm 1048576 Feb 19 11:18 94325987-b074-4368-8c8c-b5b13a7b67d8.lease
-rw-rw----. 1 vdsm kvm 10240 Feb 19 11:18 94325987-b074-4368-8c8c-b5b13a7b67d8

[root@storage-ge4-vdsm2 ~]# cat /rhev/data-center/0666bee6-782b-4775-8683-93209de805db/8b27f44c-be74-405e-a099-39cdef2f83bc/images/65dfdc77-a692-488a-8f6a-7cfc9beeceda/94325987-b074-4368-8c8c-b5b13a7b67d8.meta
DOMAIN=8b27f44c-be74-405e-a099-39cdef2f83bc
CTIME=1550567896
FORMAT=RAW
DISKTYPE=MEMM
LEGALITY=LEGAL
SIZE=20
VOLTYPE=LEAF
DESCRIPTION={"DiskAlias":"vm1_snapshot_metadata","DiskDescription":"Memory snapshot disk for snapshot 's2' of VM 'vm1' (VM ID: '70c4e34c-8171-49e1-a195-3a0372320177')"}
IMAGE=65dfdc77-a692-488a-8f6a-7cfc9beeceda
PUUID=00000000-0000-0000-0000-000000000000
MTIME=0
POOL_UUID=
TYPE=PREALLOCATED
GEN=0
EOF

[root@storage-ge4-vdsm2 ~]# qemu-img convert -f raw -O raw -p -t none -T none 94325987-b074-4368-8c8c-b5b13a7b67d8.meta 94325987-b074-4368-8c8c-b5b13a7b67d8.meta-copy -o preallocation=falloc
qemu-img: block/io.c:2134: bdrv_co_block_status: Assertion `*pnum && (((*pnum) % (align)) == 0) && align > offset - aligned_offset' failed.
Aborted
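A side note on the code pasted in this comment: in Python, ljust returns a new string and leaves the original untouched, so a bare `data.ljust(aligned_length)` statement pads nothing before `f.write(data)`. The thread does not say whether this is related to the problem later found in the first version of the patch, so the following is only a minimal sketch of the padding idea, not the merged vdsm code. `round_up` and `pad_to_alignment` are hypothetical helpers written for this illustration, the VM configuration is a made-up stand-in, and the sketch uses Python 3 bytes rather than the Python 2 str of the vdsm version discussed here.

```python
import pickle

BLOCK_SIZE = 4096  # alignment discussed in this bug


def round_up(n, size):
    """Round n up to the next multiple of size."""
    return (n + size - 1) // size * size


def pad_to_alignment(data, size=BLOCK_SIZE):
    """Return data padded with spaces to a multiple of size."""
    # bytes.ljust returns a new object, so the result must be kept.
    return data.ljust(round_up(len(data), size))


# Hypothetical stand-in for the VM configuration.
vm_conf = {"vmName": "vm1", "memSize": 1024}
data = pickle.dumps(vm_conf)

unpadded_len = len(data)
data.ljust(BLOCK_SIZE)           # result discarded: data is unchanged
assert len(data) == unpadded_len

padded = pad_to_alignment(data)  # result kept: length is now aligned
assert len(padded) % BLOCK_SIZE == 0

# Unpickling stops at the end of the pickle, so the padding is ignored.
assert pickle.loads(padded) == vm_conf
```

Padding with spaces mirrors the default fill of ljust; any fill byte would do as long as the reader stops at the end of the pickled data.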
(In reply to Avihai from comment #7)
> I created another memory snapshot on the same VM and checked the metadata
> memory disk and the issue is still there meaning metadata file is still not
> 4K aligned and qemu-img convert fails after the fix.

The image is aligned to 512 bytes because engine creates the image with this size (10240), and the VM you used does not have the same configuration as in the tests. So you did not reproduce the case where the vm conf data is larger than 10240 but not aligned. If that had been the case, the image would be aligned to 4096.

To ensure that the image is always aligned to 4096, we need to fix engine to create an image aligned to 4096.

However, this fix should be good enough for the current system, which does not yet support 4k drives.

> After Fix:
> [root@storage-ge4-vdsm2 ~]# ls -ltr /rhev/data-center/0666bee6-782b-4775-8683-93209de805db/8b27f44c-be74-405e-a099-39cdef2f83bc/images/65dfdc77-a692-488a-8f6a-7cfc9beeceda/
> total 1040
> -rw-r--r--. 1 vdsm kvm 424 Feb 19 11:18 94325987-b074-4368-8c8c-b5b13a7b67d8.meta
> -rw-rw----. 1 vdsm kvm 1048576 Feb 19 11:18 94325987-b074-4368-8c8c-b5b13a7b67d8.lease
> -rw-rw----. 1 vdsm kvm 10240 Feb 19 11:18 94325987-b074-4368-8c8c-b5b13a7b67d8

The last file is the image file which should be aligned, and it is aligned to 512 bytes (10240 is 20 * 512), so qemu should be able to copy it.

You need to use the same vm configuration as in automation to reproduce the case when the vm conf is larger than 10240.

> [root@storage-ge4-vdsm2 ~]# cat /rhev/data-center/0666bee6-782b-4775-8683-93209de805db/8b27f44c-be74-405e-a099-39cdef2f83bc/images/65dfdc77-a692-488a-8f6a-7cfc9beeceda/94325987-b074-4368-8c8c-b5b13a7b67d8.meta
> DOMAIN=8b27f44c-be74-405e-a099-39cdef2f83bc
> CTIME=1550567896
> FORMAT=RAW
> DISKTYPE=MEMM
> LEGALITY=LEGAL
> SIZE=20
> VOLTYPE=LEAF
> DESCRIPTION={"DiskAlias":"vm1_snapshot_metadata","DiskDescription":"Memory snapshot disk for snapshot 's2' of VM 'vm1' (VM ID: '70c4e34c-8171-49e1-a195-3a0372320177')"}
> IMAGE=65dfdc77-a692-488a-8f6a-7cfc9beeceda
> PUUID=00000000-0000-0000-0000-000000000000
> MTIME=0
> POOL_UUID=
> TYPE=PREALLOCATED
> GEN=0
> EOF

This looks correct, this is indeed a memory metadata volume...

> [root@storage-ge4-vdsm2 ~]# qemu-img convert -f raw -O raw -p -t none -T none 94325987-b074-4368-8c8c-b5b13a7b67d8.meta 94325987-b074-4368-8c8c-b5b13a7b67d8.meta-copy -o preallocation=falloc
> qemu-img: block/io.c:2134: bdrv_co_block_status: Assertion `*pnum && (((*pnum) % (align)) == 0) && align > offset - aligned_offset' failed.
> Aborted

But this is wrong: we never copy the .meta files. These are metadata files handled by vdsm. When we copy a volume, we create new metadata on the destination and copy the data file (the one without .meta) using qemu.

To reproduce this issue you don't need to run qemu manually. Just perform the same engine operation (either manually or using the REST API/SDK) that reproduces the failure in the test.

After this fix, the same operation that failed before should pass.
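The size argument in this comment can be checked with a small experiment. The sketch below assumes, as described above, that the volume is first created with 10240 bytes and then overwritten in place (the pasted vdsm code opens it with "rb+", which does not truncate). The helper names and payload sizes are made up for the illustration.

```python
import os
import tempfile

SECTOR = 512
BLOCK = 4096
INITIAL_SIZE = 20 * SECTOR  # 10240, the size reported by ls


def round_up(n, size):
    """Round n up to the next multiple of size."""
    return (n + size - 1) // size * size


def final_size(payload_len):
    """Write a padded payload over a pre-sized file and return its size."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        # Simulate the volume created in advance with the initial size.
        with open(path, "wb") as f:
            f.truncate(INITIAL_SIZE)
        # "rb+" overwrites from offset 0 and never shrinks the file.
        with open(path, "rb+") as f:
            f.write(b"x" * round_up(payload_len, BLOCK))
        return os.path.getsize(path)
    finally:
        os.remove(path)


small = final_size(3000)   # padded payload (4096) fits inside 10240
large = final_size(11000)  # padded payload (12288) grows the file

print(small, small % SECTOR == 0, small % BLOCK == 0)  # 10240 True False
print(large, large % SECTOR == 0, large % BLOCK == 0)  # 12288 True True
```

With a small configuration the file keeps its 10240-byte size, which is a multiple of 512 but not of 4096. Only when the padded payload exceeds the initial size does the file end up 4096-aligned.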
(In reply to Nir Soffer from comment #8)
> [...]
> But to reproduce this issue you don't need to run qemu manually, just do the
> same operation engine (either manually or using REST API/SDK) that reproduces
> the failure in the test.
>
> After this fix, the same operation that failed before should pass.

Now I get it: we check that the snapshot metadata image itself is aligned to 512B, not its .meta file.
I'll retest with the automation test that reproduces the issue and update.
Avishay, in the failing automation tests, what is the failing operation? Create template from VM? Export VM? I want to reproduce this on my system.
Avihay, there was a bug in the previous patch. With the new version of the patch, I verified that creating a memory snapshot creates an aligned metadata volume.

I created a VM with 4 disks and a VM lease, to make sure the VM XML is big enough to overflow 10k.

Then I created a snapshot with memory. Here is the memory volume:

$ grep DISKTYPE=MEMM /rhev/data-center/mnt/dumbo.tlv.redhat.com\:_voodoo_v43-01/08272182-9fb1-4609-bd3b-0246b66eafa3/images/*/*.meta
/rhev/data-center/mnt/dumbo.tlv.redhat.com:_voodoo_v43-01/08272182-9fb1-4609-bd3b-0246b66eafa3/images/a5528557-1e27-4448-b42a-a36ba1ed1e1c/89de52e4-5691-4f9d-8c45-7eaf81ea448d.meta:DISKTYPE=MEMM

$ ls -l /rhev/data-center/mnt/dumbo.tlv.redhat.com:_voodoo_v43-01/08272182-9fb1-4609-bd3b-0246b66eafa3/images/a5528557-1e27-4448-b42a-a36ba1ed1e1c
total 1044
-rw-rw----. 1 vdsm kvm 16384 Feb 19 19:52 89de52e4-5691-4f9d-8c45-7eaf81ea448d
-rw-rw----. 1 vdsm kvm 1048576 Feb 19 19:52 89de52e4-5691-4f9d-8c45-7eaf81ea448d.lease
-rw-r--r--. 1 vdsm kvm 443 Feb 19 19:52 89de52e4-5691-4f9d-8c45-7eaf81ea448d.meta

You can see that the volume size (16384) is aligned to 4k.

With this fix the failing tests should pass. I cannot verify this because there is no info in this bug or in the qemu bug about which engine flow triggered copyImage in vdsm.

If you want to verify that this change fixes the failing automated tests, please use the rpms from
https://jenkins.ovirt.org/job/vdsm_standard-check-patch/3037/

Changing code manually on the host is not a good idea for verifying a fix.
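The grep and ls check in this comment can also be scripted. This is a rough sketch, assuming the file storage domain layout visible in this thread (each data volume sits under images/IMAGE_ID/ next to a .meta file of the same name, and the .meta file holds KEY=VALUE lines terminated by EOF). `read_meta` and `find_memory_metadata_volumes` are hypothetical helpers, not part of vdsm.

```python
import glob
import os

BLOCK = 4096


def read_meta(path):
    """Parse KEY=VALUE lines from a volume .meta file, stopping at EOF."""
    meta = {}
    with open(path) as f:
        for line in f:
            line = line.strip()
            if line == "EOF":
                break
            key, sep, value = line.partition("=")
            if sep:
                meta[key] = value
    return meta


def find_memory_metadata_volumes(domain_dir):
    """Yield (data_volume_path, size, aligned) for DISKTYPE=MEMM volumes."""
    pattern = os.path.join(domain_dir, "images", "*", "*.meta")
    for meta_path in sorted(glob.glob(pattern)):
        if read_meta(meta_path).get("DISKTYPE") != "MEMM":
            continue
        # The data volume is the file without the .meta suffix; this is
        # the file whose size matters, not the .meta file itself.
        volume_path = meta_path[:-len(".meta")]
        size = os.path.getsize(volume_path)
        yield volume_path, size, size % BLOCK == 0
```

Calling `list(find_memory_metadata_volumes(domain_dir))` with the storage domain directory (the one containing images/) returns one tuple per memory metadata volume, with the last element telling whether the data volume size is a multiple of 4096.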
(In reply to Nir Soffer from comment #11)
> Avishay, in the failing automation tests, what is the failing operation?
> create template from vm? export vm? I want reproduce this on my system.

Export VM caused the issue, see full automation test scenario below.

Full scenario from Bug 1649788:
Steps to Reproduce:
12:13:59 2018-11-14 12:13:59,088 INFO Test Setup 1: Creating VM vm_TestCase5134_1412135908
12:14:23 2018-11-14 12:14:22,988 INFO Test Setup 2: Creating snapshot snap_TestCase5134_1412142298 of VM vm_TestCase5134_1412135908
12:14:23 2018-11-14 12:14:22,991 INFO Test Setup 3: [class] Add snapshot to VM vm_TestCase5134_1412135908 with {'persist_memory': False, 'description': 'snap_TestCase5134_1412142298', 'wait': True}
12:14:39 2018-11-14 12:14:39,100 INFO Test Setup 4: Starting VM vm_TestCase5134_1412135908
12:14:39 2018-11-14 12:14:39,103 INFO Test Setup 5: [class] Start VM vm_TestCase5134_1412135908 with {'wait_for_ip': False, 'pause': False, 'use_cloud_init': False, 'timeout': 600, 'wait_for_status': 'up'}
12:15:50 2018-11-14 12:15:50,673 INFO Test Step 6: Starting cat process on VM vm_TestCase5134_1412135908
12:24:27 2018-11-14 12:24:27,576 INFO Test Step 7: Creating snapshot snap_TestCase5134_1412155067 with RAM state
12:24:27 2018-11-14 12:24:27,580 INFO Test Setup 8: [function] Add snapshot to VM vm_TestCase5134_1412135908 with {'persist_memory': True, 'description': 'snap_TestCase5134_1412155067', 'wait': True}
12:24:59 2018-11-14 12:24:58,990 INFO Test Setup 9: Power off VM vm_TestCase5134_1412135908
12:24:59 2018-11-14 12:24:59,163 INFO Test Setup 10: [function] Stop vm vm_TestCase5134_1412135908 with {'async': 'true'}
12:25:02 2018-11-14 12:25:02,876 INFO 002: storage/rhevmtests.storage.storage_snapshots.test_ram_snapshot.TestCase5134.test_import_vm_with_memory_state_snapshot[nfs]
12:25:02 2018-11-14 12:25:02,876 INFO Import a VM that has memory state snapshot and ensure it resumes memory
12:25:02 2018-11-14 12:25:02,876 INFO state from that snapshot successfully
12:25:02 2018-11-14 12:25:02,877 INFO STORAGE: NFS
12:25:02 2018-11-14 12:25:02,877 INFO Test Step 11: Exporting VM vm_TestCase5134_1412135908 to domain export_domain
12:25:02 2018-11-14 12:25:02,879 INFO Test Step 12: Export vm vm_TestCase5134_1412135908 to export storage domain with {'exclusive': 'false', 'storagedomain': 'export_domain', 'discard_snapshots': 'false', 'timeout': 600, 'async': False}
12:25:18 2018-11-14 12:25:18,035 ERROR Result: FAILED
(In reply to Nir Soffer from comment #12)
> [...]
> If you want to verify that this change fixes the failing automated tests
> please use the rpms from
> https://jenkins.ovirt.org/job/vdsm_standard-check-patch/3037/
>
> Changing code manually on the host is not a good idea for verifying a fix.

I totally agree.
Please merge it to master so we can test it on upstream.
We do not have the capacity to build and check specific patches.
This should work if you test with master; both patches are merged.

The third patch, which is not merged yet, improves the way we align images, but it should not affect the behavior for copying images.
(In reply to Nir Soffer from comment #15)
> This should work if you test with master, both patches merged.
>
> The third patch which is not merged yet improves the way we align images,
> but it should not affect the behavior for copying images.

For verification I need all patches merged and the bug to be in 'MODIFIED' or 'ON_QA'.
Please notify me when all patches are in (4.3 upstream or downstream) so I can verify.
Avihay, all patches are merged, they will be included in the next build (4.30.10).
Nir, it is still reproducible.

[root@storage-ge8-vdsm1 cd091fc1-207a-494b-9a8a-fab5f260daeb]# rpm -qa | grep vdsm
vdsm-hook-vhostmd-4.30.10-1.el7ev.noarch
vdsm-api-4.30.10-1.el7ev.noarch
vdsm-network-4.30.10-1.el7ev.x86_64
vdsm-jsonrpc-4.30.10-1.el7ev.noarch
vdsm-4.30.10-1.el7ev.x86_64
vdsm-hook-fcoe-4.30.10-1.el7ev.noarch
vdsm-python-4.30.10-1.el7ev.noarch
vdsm-common-4.30.10-1.el7ev.noarch
vdsm-client-4.30.10-1.el7ev.noarch
vdsm-hook-openstacknet-4.30.10-1.el7ev.noarch
vdsm-http-4.30.10-1.el7ev.noarch
vdsm-hook-ethtool-options-4.30.10-1.el7ev.noarch
vdsm-yajsonrpc-4.30.10-1.el7ev.noarch
vdsm-hook-vmfex-dev-4.30.10-1.el7ev.noarch

[root@storage-ge8-vdsm1 cd091fc1-207a-494b-9a8a-fab5f260daeb]# cat ^C
[root@storage-ge8-vdsm1 cd091fc1-207a-494b-9a8a-fab5f260daeb]# ls -ltr
total 1048
-rw-r--r--. 1 vdsm kvm 415 Mar 7 17:13 770dc881-409a-4022-abe1-e99d5f5c9745.meta
-rw-rw----. 1 vdsm kvm 1048576 Mar 7 17:13 770dc881-409a-4022-abe1-e99d5f5c9745.lease
-rw-rw----. 1 vdsm kvm 12288 Mar 7 17:13 770dc881-409a-4022-abe1-e99d5f5c9745
-rw-r--r--. 1 root root 415 Mar 7 17:43 770dc881-409a-4022-abe1-e99d5f5c9745.copy
-rw-r--r--. 1 root root 512 Mar 7 17:43 770dc881-409a-4022-abe1-e99d5f5c9745.copy2

[root@storage-ge8-vdsm1 cd091fc1-207a-494b-9a8a-fab5f260daeb]# cat 770dc881-409a-4022-abe1-e99d5f5c9745.meta
DOMAIN=1cfac769-8fc7-4cc3-acc4-cb1f7e45fd54
CTIME=1551971581
FORMAT=RAW
DISKTYPE=MEMM
LEGALITY=LEGAL
SIZE=20
VOLTYPE=LEAF
DESCRIPTION={"DiskAlias":"b1678373_snapshot_metadata","DiskDescription":"Memory snapshot disk for snapshot 's1' of VM 'b1678373' (VM ID: 'a2320396-6318-4625-b76b-1a4569424266')"}
IMAGE=cd091fc1-207a-494b-9a8a-fab5f260daeb
PUUID=00000000-0000-0000-0000-000000000000
TYPE=PREALLOCATED
GEN=0
EOF

[root@storage-ge8-vdsm1 cd091fc1-207a-494b-9a8a-fab5f260daeb]# qemu-img --version
qemu-img version 2.12.0 (qemu-kvm-rhev-2.12.0-21.el7)
Copyright (c) 2003-2017 Fabrice Bellard and the QEMU Project developers

[root@storage-ge8-vdsm1 cd091fc1-207a-494b-9a8a-fab5f260daeb]# qemu-img convert -f raw -O raw -p -t none -T none 770dc881-409a-4022-abe1-e99d5f5c9745.copy 770dc881-409a-4022-abe1-e99d5f5c9745.copy2 -o preallocation=falloc
qemu-img: block/io.c:2134: bdrv_co_block_status: Assertion `*pnum && (((*pnum) % (align)) == 0) && align > offset - aligned_offset' failed.
Aborted (core dumped)

What am I missing?
> [root@storage-ge8-vdsm1 cd091fc1-207a-494b-9a8a-fab5f260daeb]# ls -ltr
> total 1048
> -rw-r--r--. 1 vdsm kvm 415 Mar 7 17:13 770dc881-409a-4022-abe1-e99d5f5c9745.meta
> -rw-rw----. 1 vdsm kvm 1048576 Mar 7 17:13 770dc881-409a-4022-abe1-e99d5f5c9745.lease
> -rw-rw----. 1 vdsm kvm 12288 Mar 7 17:13 770dc881-409a-4022-abe1-e99d5f5c9745
> -rw-r--r--. 1 root root 415 Mar 7 17:43 770dc881-409a-4022-abe1-e99d5f5c9745.copy
> -rw-r--r--. 1 root root 512 Mar 7 17:43 770dc881-409a-4022-abe1-e99d5f5c9745.copy2

770dc881-409a-4022-abe1-e99d5f5c9745.copy is a copy of 770dc881-409a-4022-abe1-e99d5f5c9745.meta. This is a vdsm private data file that is never copied with qemu-img...

> [root@storage-ge8-vdsm1 cd091fc1-207a-494b-9a8a-fab5f260daeb]# qemu-img convert -f raw -O raw -p -t none -T none 770dc881-409a-4022-abe1-e99d5f5c9745.copy 770dc881-409a-4022-abe1-e99d5f5c9745.copy2 -o preallocation=falloc
> qemu-img: block/io.c:2134: bdrv_co_block_status: Assertion `*pnum && (((*pnum) % (align)) == 0) && align > offset - aligned_offset' failed.
> Aborted (core dumped)

So you should not try to copy it with qemu.

To verify this bug, you need to:

- Reproduce with an older version, by creating a VM with enough disks that creating a memory snapshot creates a memory metadata volume (MEMM) bigger than 10k.
- Then export the VM to an export domain (Avihay said this is the flow reproducing this issue).
- Repeat the same with the latest build; the operation should succeed.

Checking manually is error prone; best avoid it.

I think you have disabled the automated tests that used to fail because of this issue. If you now have an environment with the new build, these tests should work.
Verified on vdsm 4.30.10 engine 4.3.2-0.
This bugzilla is included in oVirt 4.3.2 release, published on March 19th 2019. Since the problem described in this bug report should be resolved in oVirt 4.3.2 release, it has been closed with a resolution of CURRENT RELEASE. If the solution does not work for you, please open a new bug report.