Bug 1741456
Summary: | Image cannot be used after blockcommit snapshots to base image and destroy/start vm | | |
---|---|---|---|
Product: | Red Hat Enterprise Linux Advanced Virtualization | Reporter: | yisun |
Component: | libvirt | Assignee: | Michal Privoznik <mprivozn> |
Status: | CLOSED ERRATA | QA Contact: | yisun |
Severity: | high | Docs Contact: | |
Priority: | high | | |
Version: | 8.1 | CC: | dzheng, fjin, jdenemar, jsuchane, kchamart, lcheng, lmen, mprivozn, mtessun, toneata, xuzhang, yafu, yisun |
Target Milestone: | rc | Keywords: | Automation, Regression, Upstream, ZStream |
Target Release: | 8.0 | | |
Hardware: | x86_64 | | |
OS: | Linux | | |
Whiteboard: | | | |
Fixed In Version: | libvirt-5.6.0-9.el8 | Doc Type: | If docs needed, set a value |
Doc Text: | | Story Points: | --- |
Clone Of: | | | |
: | 1771501 | Environment: | |
Last Closed: | 2020-02-04 18:28:48 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | --- |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | | | |
Bug Depends On: | | | |
Bug Blocks: | 1652078, 1771501 | | |
Description
yisun
2019-08-15 08:09:58 UTC
Patches posted upstream:
https://www.redhat.com/archives/libvir-list/2019-August/msg01418.html

Fixed upstream as:

    16fb3c8b83 qemu_blockjob: Remove secdriver metadata more frequently
    7f99d8a739 qemu_blockjob: Print image path on failed security metadata move too
    143a0f8b05 qemu_blockjob: Move active commit failed state handling into a function

v5.7.0-rc2

Scenario 2 from comment 0 is still failing, and this leaves the base image /var/lib/libvirt/images/jeos-27-x86_64.qcow2 dirty, which will cause more failures for follow-up cases. Changed back to ASSIGNED for now.

    (.libvirt-ci-venv-ci-runtest-ovFFGP) [root@dell-per730-62 ~]# virsh start avocado-vt-vm1
    'Domain avocado-vt-vm1 started
    (.libvirt-ci-venv-ci-runtest-ovFFGP) [root@dell-per730-62 ~]# for i in {snap_1,snap_2}; do virsh snapshot-create-as avocado-vt-vm1 $i --disk-only; done
    Domain snapshot snap_1 created
    Domain snapshot snap_2 created
    (.libvirt-ci-venv-ci-runtest-ovFFGP) [root@dell-per730-62 ~]# virsh blockcommit avocado-vt-vm1 vda --wait --verbose --top vda[1]
    Block commit: [100 %]
    Commit complete
    (.libvirt-ci-venv-ci-runtest-ovFFGP) [root@dell-per730-62 ~]# virsh destroy avocado-vt-vm1; virsh start avocado-vt-vm1
    Domain avocado-vt-vm1 destroyed
    Domain avocado-vt-vm1 started
    (.libvirt-ci-venv-ci-runtest-ovFFGP) [root@dell-per730-62 ~]# virsh blockcommit avocado-vt-vm1 vda --wait --verbose --active
    error: internal error: child reported (status=125): Requested operation is not valid: Setting different SELinux label on /var/lib/libvirt/images/jeos-27-x86_64.qcow2 which is already in use

Patch proposed upstream:
https://www.redhat.com/archives/libvir-list/2019-September/msg00600.html

Another approach implemented (as requested in review):
https://www.redhat.com/archives/libvir-list/2019-September/msg00621.html

I've just pushed the fix upstream and backported it:
http://post-office.corp.redhat.com/archives/rhvirt-patches/2019-September/msg01083.html

There's also a scratch build with this patch applied:
https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=23705370

It also contains patches I've proposed for bug 1740024 (but those are not reviewed upstream yet).

Hi Michal,
Scenario 2 in comment 0 is still reproducible, pls have a check. The following is a simpler way to reproduce it:

1. Have a running VM:

        (.libvirt-ci-venv-ci-runtest-tpP3NB) [root@ibm-x3850x6-03 src]# virsh domblklist avocado-vt-vm1
        Target Source
        ------------------------------------------------------------------------
        vda /var/lib/avocado/data/avocado-vt/images/jeos-27-x86_64.qcow2
        (.libvirt-ci-venv-ci-runtest-tpP3NB) [root@ibm-x3850x6-03 src]# virsh domstate avocado-vt-vm1
        running
        (.libvirt-ci-venv-ci-runtest-tpP3NB) [root@ibm-x3850x6-03 src]# getfattr -m trusted.libvirt.security -d /var/lib/avocado/data/avocado-vt/images/jeos-27-x86_64.qcow2 <==== nothing now

2. Create some external snapshots for it:

        (.libvirt-ci-venv-ci-runtest-tpP3NB) [root@ibm-x3850x6-03 src]# for i in {1..2}; do virsh snapshot-create-as avocado-vt-vm1 snap_$i snap1-desc --disk-only; done
        Domain snapshot snap_1 created
        Domain snapshot snap_2 created

3. Run blockcommit WITHOUT --pivot:

        (.libvirt-ci-venv-ci-runtest-tpP3NB) [root@ibm-x3850x6-03 src]# virsh blockcommit avocado-vt-vm1 vda --wait --verbose --active
        Block commit: [100 %]
        Now in synchronized phase

4. Now the image file has the following extended attributes:

        (.libvirt-ci-venv-ci-runtest-tpP3NB) [root@ibm-x3850x6-03 src]# getfattr -m trusted.libvirt.security -d /var/lib/avocado/data/avocado-vt/images/jeos-27-x86_64.qcow2
        getfattr: Removing leading '/' from absolute path names
        # file: var/lib/avocado/data/avocado-vt/images/jeos-27-x86_64.qcow2
        trusted.libvirt.security.dac="+107:+107"
        trusted.libvirt.security.ref_dac="1"
        trusted.libvirt.security.ref_selinux="1"
        trusted.libvirt.security.selinux="system_u:object_r:svirt_image_t:s0:c229,c326"
        trusted.libvirt.security.timestamp_dac="1573791864"
        trusted.libvirt.security.timestamp_selinux="1573791864"

5. Destroy the VM:

        (.libvirt-ci-venv-ci-runtest-tpP3NB) [root@ibm-x3850x6-03 src]# virsh destroy avocado-vt-vm1
        Domain avocado-vt-vm1 destroyed

6. Even though the VM is stopped, the file's xattrs still exist, and if we "virsh edit $VM" to use the original image again, the VM cannot be started:

        (.libvirt-ci-venv-ci-runtest-tpP3NB) [root@ibm-x3850x6-03 src]# getfattr -m trusted.libvirt.security -d /var/lib/avocado/data/avocado-vt/images/jeos-27-x86_64.qcow2
        getfattr: Removing leading '/' from absolute path names
        # file: var/lib/avocado/data/avocado-vt/images/jeos-27-x86_64.qcow2
        trusted.libvirt.security.dac="+107:+107"
        trusted.libvirt.security.ref_dac="1"
        trusted.libvirt.security.ref_selinux="1"
        trusted.libvirt.security.selinux="system_u:object_r:svirt_image_t:s0:c229,c326"
        trusted.libvirt.security.timestamp_dac="1573791864"
        trusted.libvirt.security.timestamp_selinux="1573791864"

Due to the above comment, I'll set this back to ASSIGNED for now. For the automation scripts, I've submitted a PR so that other cases are not blocked when this failure happens:
https://github.com/autotest/tp-libvirt/pull/2430

Patches proposed upstream for the issue mentioned in comment 19:
https://www.redhat.com/archives/libvir-list/2019-November/msg00851.html

Pushed upstream:

    8fa0374c5b qemuProcessStop: Remove image metadata for running mirror jobs
    1c12b86185 qemu: Separate image metadata removal into a function

Verified: reproduced with auto case on libvirt-5.6.0-8.virtcov.el8.x86_64
https://libvirt-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/view/libvirt/view/RHEL-8.1%20x86_64/job/libvirt-RHEL-8.1-runtest-x86_64-function-block_job_commit_pull/52/testReport/rhel.virsh/blockcommit/normal_test_single_chain_file_disk_local_no_ga_notimeout_nobase_top_active_without_pivot/

Fixed with auto case on libvirt-5.6.0-9.module+el8.1.1+4955+f0b25565.x86_64
https://libvirt-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/view/libvirt/view/RHEL-8.1%20x86_64/job/libvirt-RHEL-8.1-runtest-x86_64-function-block_job_commit_pull/53/testReport/rhel.virsh/blockcommit/normal_test_single_chain_file_disk_local_no_ga_notimeout_nobase_top_active_without_pivot/

The whole test job has no regression failures (the failed cases are not related to the current bz):
https://libvirt-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/view/libvirt/view/RHEL-8.1%20x86_64/job/libvirt-RHEL-8.1-runtest-x86_64-function-block_job_commit_pull/53/testReport/

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHBA-2020:0404
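
For anyone re-checking this on another build, the manual getfattr inspection from the reproducer (empty output on a clean image, trusted.libvirt.security.* attributes left behind on an affected one) could be scripted roughly as below. This is only a sketch: the two function names are made up for illustration and are not part of libvirt or tp-libvirt, and it assumes getfattr prints one name="value" pair per line as in the output pasted in this report.

```shell
#!/bin/sh
# Sketch only: detect leftover libvirt security metadata in `getfattr -d` output.
# Both function names are hypothetical helpers, not libvirt or tp-libvirt APIs.

# Reads the output of `getfattr -m trusted.libvirt.security -d <image>` on stdin
# and prints the names of attributes in the trusted.libvirt.security namespace.
# Comment lines such as "# file: ..." do not match and are dropped.
list_libvirt_security_xattrs() {
    sed -n 's/^\(trusted\.libvirt\.security\.[A-Za-z0-9_]*\)=.*/\1/p'
}

# Exits 0 when at least one such attribute is present on stdin, 1 otherwise.
has_stale_security_xattrs() {
    [ -n "$(list_libvirt_security_xattrs)" ]
}

# Intended use after `virsh destroy` (needs root; $image is the disk path):
#   getfattr -m trusted.libvirt.security -d "$image" 2>/dev/null \
#       | has_stale_security_xattrs && echo "stale metadata on $image"
```

Parsing stdin rather than calling getfattr inside the helper keeps it usable on saved logs as well as on a live host.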