Image cannot be used after blockcommit snapshots to base image and destroy/start vm

Versions:
libvirt-5.6.0-1.module+el8.1.0+3890+4d3d259c.x86_64
qemu-kvm-4.0.0-6.module+el8.1.0+3736+a2aefea3.x86_64

How reproducible:
100%

Scenario 1: Blockcommit from top to base with --active and without --pivot, then restart the vm

1. Have a running vm with a virtual disk = vda
$ virsh start avocado-vt-vm1
Domain avocado-vt-vm1 started

$ virsh domblklist avocado-vt-vm1
Target     Source
----------------------------------------------------------------
vda        /var/lib/libvirt/images/RHEL-8.1-x86_64-latest.qcow2

2. Create a snapshot for the vm
$ virsh snapshot-create-as avocado-vt-vm1 snap_1 --disk-only
Domain snapshot snap_1 created

3. Do a blockcommit to merge the snapshot into the base image
$ virsh blockcommit avocado-vt-vm1 vda --wait --verbose --active
Block commit: [100 %]
Now in synchronized phase

$ virsh blockjob avocado-vt-vm1 vda
Active Block Commit: [100 %]

4. Abort the block job (this step is optional and can be skipped; without it, the vm is restarted with an active block job)
$ virsh blockjob avocado-vt-vm1 vda --abort

$ virsh blockjob avocado-vt-vm1 vda
No current block job for vda

5. Restart the vm
$ virsh destroy avocado-vt-vm1; virsh start avocado-vt-vm1
Domain avocado-vt-vm1 destroyed
Domain avocado-vt-vm1 started

6. Now the base image cannot be used again. We cannot do another blockcommit, or use the image directly in this vm.
$ virsh blockcommit avocado-vt-vm1 vda --wait --verbose --active
error: internal error: child reported (status=125): Requested operation is not valid: Setting different SELinux label on /var/lib/libvirt/images/RHEL-8.1-x86_64-latest.qcow2 which is already in use

Scenario 2: Blockcommit from middle to base, then restart the vm

This is the scenario used in our automated test case.

1. Have a running vm with a virtual disk = vda
$ virsh start avocado-vt-vm1
Domain avocado-vt-vm1 started

$ virsh domblklist avocado-vt-vm1
Target     Source
----------------------------------------------------------------
vda        /var/lib/libvirt/images/RHEL-8.1-x86_64-latest.qcow2

2. Create 2 disk-only snapshots
$ for i in {snap_1,snap_2}; do virsh snapshot-create-as avocado-vt-vm1 $i --disk-only; done
Domain snapshot snap_1 created
Domain snapshot snap_2 created

3. Do a blockcommit from the middle image to the base image
$ virsh blockcommit avocado-vt-vm1 vda --wait --verbose --top vda[1]
Block commit: [100 %]
Commit complete

4. Destroy and start the vm
$ virsh destroy avocado-vt-vm1; virsh start avocado-vt-vm1
Domain avocado-vt-vm1 destroyed
Domain avocado-vt-vm1 started

5. Now the base image cannot be used again:

5.1 Try to do a blockcommit to merge everything into the base image.
$ virsh blockcommit avocado-vt-vm1 vda --wait --verbose --active
error: internal error: child reported (status=125): Requested operation is not valid: Setting different SELinux label on /var/lib/libvirt/images/RHEL-8.1-x86_64-latest.qcow2 which is already in use

5.2 Use virsh edit to make /var/lib/libvirt/images/RHEL-8.1-x86_64-latest.qcow2 the source file of vda, then start the vm
$ virsh dumpxml avocado-vt-vm1 | awk '/<disk/,/<\/disk/'
    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2'/>
      <source file='/var/lib/libvirt/images/RHEL-8.1-x86_64-latest.qcow2'/>
      <target dev='vda' bus='virtio'/>
      <address type='pci' domain='0x0000' bus='0x04' slot='0x00' function='0x0'/>
    </disk>

$ virsh start avocado-vt-vm1
error: Failed to start domain avocado-vt-vm1
error: internal error: child reported (status=125): Requested operation is not valid: Setting different SELinux label on /var/lib/libvirt/images/RHEL-8.1-x86_64-latest.qcow2 which is already in use

Expected result:
The original image should be usable without problems in scenario1_step6 and scenario2_step5.
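For convenience, Scenario 2 can be sketched as a small shell function. This is only a rough sketch under assumptions: the domain is already running, the function name repro_scenario2 and the VIRSH override variable are hypothetical helpers introduced here (VIRSH is not a libvirt feature, it only allows a dry run with VIRSH=echo), and the snapshot names match the steps above.

```shell
#!/bin/sh
# Sketch of Scenario 2. VIRSH is an assumed override hook so that the
# command sequence can be printed instead of executed (VIRSH=echo).
VIRSH=${VIRSH:-virsh}

repro_scenario2() {
    dom=$1    # domain name, e.g. avocado-vt-vm1 (must already be running)
    disk=$2   # disk target, e.g. vda

    # Step 2: create two disk-only snapshots.
    for snap in snap_1 snap_2; do
        "$VIRSH" snapshot-create-as "$dom" "$snap" --disk-only || return 1
    done

    # Step 3: commit the middle image into the base image.
    "$VIRSH" blockcommit "$dom" "$disk" --wait --verbose --top "${disk}[1]" || return 1

    # Step 4: destroy and start the domain.
    "$VIRSH" destroy "$dom" || return 1
    "$VIRSH" start "$dom" || return 1

    # Step 5.1: on affected builds this is the command that fails with
    # "Setting different SELinux label on ... which is already in use".
    "$VIRSH" blockcommit "$dom" "$disk" --wait --verbose --active
}
```

Example use: "repro_scenario2 avocado-vt-vm1 vda" runs the steps, and a non-zero exit status from the last command indicates the failure from step 5.1. With VIRSH=echo set beforehand, the function only prints the six virsh command lines.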
Patches posted upstream: https://www.redhat.com/archives/libvir-list/2019-August/msg01418.html
v2: https://www.redhat.com/archives/libvir-list/2019-August/msg01433.html
Fixed upstream as:

16fb3c8b83 qemu_blockjob: Remove secdriver metadata more frequently
7f99d8a739 qemu_blockjob: Print image path on failed security metadata move too
143a0f8b05 qemu_blockjob: Move active commit failed state handling into a function

v5.7.0-rc2
To POST: http://post-office.corp.redhat.com/archives/rhvirt-patches/2019-September/msg00025.html
Scenario 2 from comment 0 is still failing. It leaves the base image /var/lib/libvirt/images/jeos-27-x86_64.qcow2 dirty, which causes more failures in the follow-up cases. Changed back to ASSIGNED for now.

(.libvirt-ci-venv-ci-runtest-ovFFGP) [root@dell-per730-62 ~]# virsh start avocado-vt-vm1
Domain avocado-vt-vm1 started

(.libvirt-ci-venv-ci-runtest-ovFFGP) [root@dell-per730-62 ~]# for i in {snap_1,snap_2}; do virsh snapshot-create-as avocado-vt-vm1 $i --disk-only; done
Domain snapshot snap_1 created
Domain snapshot snap_2 created

(.libvirt-ci-venv-ci-runtest-ovFFGP) [root@dell-per730-62 ~]# virsh blockcommit avocado-vt-vm1 vda --wait --verbose --top vda[1]
Block commit: [100 %]
Commit complete

(.libvirt-ci-venv-ci-runtest-ovFFGP) [root@dell-per730-62 ~]# virsh destroy avocado-vt-vm1; virsh start avocado-vt-vm1
Domain avocado-vt-vm1 destroyed
Domain avocado-vt-vm1 started

(.libvirt-ci-venv-ci-runtest-ovFFGP) [root@dell-per730-62 ~]# virsh blockcommit avocado-vt-vm1 vda --wait --verbose --active
error: internal error: child reported (status=125): Requested operation is not valid: Setting different SELinux label on /var/lib/libvirt/images/jeos-27-x86_64.qcow2 which is already in use
Patch proposed upstream: https://www.redhat.com/archives/libvir-list/2019-September/msg00600.html
Another approach implemented (as requested in review): https://www.redhat.com/archives/libvir-list/2019-September/msg00621.html
I've just pushed the fix upstream and backported it:

http://post-office.corp.redhat.com/archives/rhvirt-patches/2019-September/msg01083.html

There's also a scratch build with this patch applied:

https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=23705370

It also contains patches I've proposed for bug 1740024 (but those are not reviewed upstream yet).
Hi Michal,

Scenario 2 in comment 0 is still reproducible, please take a look. The following is a simpler way to reproduce it:

1. Have a running vm
(.libvirt-ci-venv-ci-runtest-tpP3NB) [root@ibm-x3850x6-03 src]# virsh domblklist avocado-vt-vm1
Target     Source
------------------------------------------------------------------------
vda        /var/lib/avocado/data/avocado-vt/images/jeos-27-x86_64.qcow2

(.libvirt-ci-venv-ci-runtest-tpP3NB) [root@ibm-x3850x6-03 src]# virsh domstate avocado-vt-vm1
running

(.libvirt-ci-venv-ci-runtest-tpP3NB) [root@ibm-x3850x6-03 src]# getfattr -m trusted.libvirt.security -d /var/lib/avocado/data/avocado-vt/images/jeos-27-x86_64.qcow2
<==== nothing now

2. Create some external snapshots for it
(.libvirt-ci-venv-ci-runtest-tpP3NB) [root@ibm-x3850x6-03 src]# for i in {1..2}; do virsh snapshot-create-as avocado-vt-vm1 snap_$i snap1-desc --disk-only; done
Domain snapshot snap_1 created
Domain snapshot snap_2 created

3. Do a blockcommit WITHOUT --pivot
(.libvirt-ci-venv-ci-runtest-tpP3NB) [root@ibm-x3850x6-03 src]# virsh blockcommit avocado-vt-vm1 vda --wait --verbose --active
Block commit: [100 %]
Now in synchronized phase

4. Now the image file has the following extended attributes:
(.libvirt-ci-venv-ci-runtest-tpP3NB) [root@ibm-x3850x6-03 src]# getfattr -m trusted.libvirt.security -d /var/lib/avocado/data/avocado-vt/images/jeos-27-x86_64.qcow2
getfattr: Removing leading '/' from absolute path names
# file: var/lib/avocado/data/avocado-vt/images/jeos-27-x86_64.qcow2
trusted.libvirt.security.dac="+107:+107"
trusted.libvirt.security.ref_dac="1"
trusted.libvirt.security.ref_selinux="1"
trusted.libvirt.security.selinux="system_u:object_r:svirt_image_t:s0:c229,c326"
trusted.libvirt.security.timestamp_dac="1573791864"
trusted.libvirt.security.timestamp_selinux="1573791864"

5. Destroy the vm
(.libvirt-ci-venv-ci-runtest-tpP3NB) [root@ibm-x3850x6-03 src]# virsh destroy avocado-vt-vm1
Domain avocado-vt-vm1 destroyed

6. Even though the vm is stopped, the file's xattrs still exist, and if we "virsh edit $VM" to use the original image again, the vm cannot be started.
(.libvirt-ci-venv-ci-runtest-tpP3NB) [root@ibm-x3850x6-03 src]# getfattr -m trusted.libvirt.security -d /var/lib/avocado/data/avocado-vt/images/jeos-27-x86_64.qcow2
getfattr: Removing leading '/' from absolute path names
# file: var/lib/avocado/data/avocado-vt/images/jeos-27-x86_64.qcow2
trusted.libvirt.security.dac="+107:+107"
trusted.libvirt.security.ref_dac="1"
trusted.libvirt.security.ref_selinux="1"
trusted.libvirt.security.selinux="system_u:object_r:svirt_image_t:s0:c229,c326"
trusted.libvirt.security.timestamp_dac="1573791864"
trusted.libvirt.security.timestamp_selinux="1573791864"
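The check in steps 4 and 6 can be automated by filtering the getfattr output. The following is a minimal sketch only: the helper names list_secmeta and has_stale_secmeta are hypothetical (they are not part of libvirt or attr), and the only assumption is that getfattr -d prints one name="value" pair per line, as in the output above.

```shell
#!/bin/sh
# Hypothetical helpers: inspect `getfattr -d` output supplied on stdin.

# Print the names of all trusted.libvirt.security.* attributes in the dump.
list_secmeta() {
    # Keep only the attribute name from each matching name="value" line;
    # comment lines such as "# file: ..." do not match and are dropped.
    sed -n 's/^\(trusted\.libvirt\.security\.[A-Za-z_]*\)=.*/\1/p'
}

# Exit 0 if the dump contains at least one such attribute, non-zero otherwise.
has_stale_secmeta() {
    [ -n "$(list_secmeta)" ]
}
```

Example use after "virsh destroy" (IMG is a placeholder for the image path):

getfattr -m trusted.libvirt.security -d "$IMG" 2>/dev/null | has_stale_secmeta && echo "leftover security metadata on $IMG"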
Due to the above comment, I'll set this back to ASSIGNED for now. For the automation scripts, I've submitted a PR to keep other cases from being blocked when this failure happens: https://github.com/autotest/tp-libvirt/pull/2430
Patches proposed upstream for the issue mentioned in comment 19: https://www.redhat.com/archives/libvir-list/2019-November/msg00851.html
Pushed upstream:

8fa0374c5b qemuProcessStop: Remove image metadata for running mirror jobs
1c12b86185 qemu: Separate image metadata removal into a function
To POST: http://post-office.corp.redhat.com/archives/rhvirt-patches/2019-November/msg00911.html
Reproduced with the automated case on libvirt-5.6.0-8.virtcov.el8.x86_64:
https://libvirt-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/view/libvirt/view/RHEL-8.1%20x86_64/job/libvirt-RHEL-8.1-runtest-x86_64-function-block_job_commit_pull/52/testReport/rhel.virsh/blockcommit/normal_test_single_chain_file_disk_local_no_ga_notimeout_nobase_top_active_without_pivot/

Fixed with the automated case on libvirt-5.6.0-9.module+el8.1.1+4955+f0b25565.x86_64:
https://libvirt-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/view/libvirt/view/RHEL-8.1%20x86_64/job/libvirt-RHEL-8.1-runtest-x86_64-function-block_job_commit_pull/53/testReport/rhel.virsh/blockcommit/normal_test_single_chain_file_disk_local_no_ga_notimeout_nobase_top_active_without_pivot/

The whole test job has no regression failures (the failed cases are not related to the current bz):
https://libvirt-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/view/libvirt/view/RHEL-8.1%20x86_64/job/libvirt-RHEL-8.1-runtest-x86_64-function-block_job_commit_pull/53/testReport/
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:0404