Bug 1781079
| Field | Value |
|---|---|
| Summary | [blockdev enablement] VM cannot be started if we took an external snapshot and destroyed it |
| Product | Red Hat Enterprise Linux Advanced Virtualization |
| Component | libvirt |
| Version | 8.2 |
| Status | CLOSED ERRATA |
| Severity | high |
| Priority | high |
| Reporter | yisun |
| Assignee | Peter Krempa <pkrempa> |
| QA Contact | yisun |
| CC | hhan, jdenemar, lmen, pkrempa, xuzhang, yisun |
| Keywords | Automation, Regression |
| Target Milestone | rc |
| Target Release | 8.0 |
| Hardware | x86_64 |
| OS | Linux |
| Fixed In Version | libvirt-6.0.0-1.el8 |
| Last Closed | 2020-05-05 09:52:05 UTC |
| Type | Bug |
Description (yisun, 2019-12-09 10:08:30 UTC)
The problem is that libvirt did not set the original image's source struct to read-only when creating the snapshot, so the generated command line contained `"read-only":false` in the qcow2 layer's blockdev parameters:

```
-blockdev '{"driver":"file","filename":"/var/lib/avocado/data/avocado-vt/images/jeos-27-x86_64.qcow2","node-name":"libvirt-2-storage","auto-read-only":true,"discard":"unmap"}'
-blockdev '{"node-name":"libvirt-2-format","read-only":false,"driver":"qcow2","file":"libvirt-2-storage","backing":null}'
```

Since access to the backing file is restricted by SELinux, qemu failed to open it in read-write mode.
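With the fix applied, the qcow2 layer that becomes a backing image is opened read-only, so the second line above would instead carry `"read-only":true`. Roughly (a sketch of the expected parameters, not captured output):

```
-blockdev '{"node-name":"libvirt-2-format","read-only":true,"driver":"qcow2","file":"libvirt-2-storage","backing":null}'
```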
Fixed upstream:

```
commit 6f6a1763a1c227b7b5d92ec813c02ce1b26b10a2
Author: Peter Krempa <pkrempa>
Date:   Mon Dec 9 12:44:41 2019 +0100

    qemu: snapshot: Mark file becoming backingStore as read-only

    For any backing file we set 'read-only' to true, but didn't do this when
    modifying the recorded backing store when creating external snapshots.
    This meant that qemu would attempt to open the backing-file read-write.
    This would fail for example when selinux is used as qemu doesn't have
    write permission for the backing file.
```

v5.10.0-118-g6f6a1763a1
Hi Peter,

The automated case still fails with the latest libvirt, using the same steps. Could you please check whether the fix was lost in the libvirt rebase? Thanks. Moving the bug back to ASSIGNED for now.
```
(.libvirt-ci-venv-ci-runtest-jcaFve) [root@dell-per730-67 ~]# rpm -qa | grep libvirt-6
libvirt-6.0.0-1.module+el8.2.0+5453+31b2b136.x86_64
python3-libvirt-6.0.0-1.module+el8.2.0+5453+31b2b136.x86_64
```

1. The automated case failed:

```
(.libvirt-ci-venv-ci-runtest-jcaFve) [root@dell-per730-67 ~]# avocado run --vt-type libvirt blockcommit.normal_test.multiple_chain.file_disk.local.no_ga.notimeout.shallow.top_active.without_pivot
JOB ID     : 894d75587a6368792dcb24bfbe444f1a97f2393d
JOB LOG    : /root/avocado/job-results/job-2020-01-19T07.37-894d755/job.log
 (1/1) type_specific.io-github-autotest-libvirt.virsh.blockcommit.normal_test.multiple_chain.file_disk.local.no_ga.notimeout.shallow.top_active.without_pivot: ERROR: VM 'avocado-vt-vm1' failed to start: error: Failed to start domain avocado-vt-vm1\nerror: internal error: process exited while connecting to monitor: 2020-01-19T12:38:21.857667Z qemu-kvm: -blockdev {"node-name":"libvirt-1-format","read-only":false,"driver"... (24.66 s)
RESULTS    : PASS 0 | ERROR 1 | FAIL 0 | SKIP 0 | WARN 0 | INTERRUPT 0 | CANCEL 0
JOB TIME   : 26.12 s
```
2. Manually reproduced:

```
# virsh snapshot-create-as avocado-vt-vm1 snap1 --disk-only
Domain snapshot snap1 created

# virsh destroy avocado-vt-vm1
Domain avocado-vt-vm1 destroyed

# virsh start avocado-vt-vm1
error: Failed to start domain avocado-vt-vm1
error: internal error: process exited while connecting to monitor: 2020-01-19T12:42:45.179232Z qemu-kvm: -blockdev {"node-name":"libvirt-1-format","read-only":false,"driver":"qcow2","file":"libvirt-1-storage","backing":"libvirt-2-format"}: Could not reopen file: Permission denied
```
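For scripted regression checks, the same three steps can be driven through the python3-libvirt bindings listed above. A minimal sketch, assuming the domain name from this report:

```python
#!/usr/bin/env python3
# Sketch of the virsh reproducer above via python3-libvirt; adjust the
# domain name to your environment.
import libvirt

SNAPSHOT_XML = "<domainsnapshot><name>snap1</name></domainsnapshot>"

conn = libvirt.open("qemu:///system")
dom = conn.lookupByName("avocado-vt-vm1")

# virsh snapshot-create-as avocado-vt-vm1 snap1 --disk-only
dom.snapshotCreateXML(SNAPSHOT_XML, libvirt.VIR_DOMAIN_SNAPSHOT_CREATE_DISK_ONLY)

dom.destroy()          # virsh destroy avocado-vt-vm1

try:
    dom.create()       # virsh start avocado-vt-vm1
    print("domain started: backing file was opened read-only")
except libvirt.libvirtError as exc:
    # Before the fix this failed with "Could not reopen file: Permission denied"
    print("start failed:", exc)
```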
I can't reproduce this issue. Could you please attach debug logs and the domain XML from before and after the snapshot?

(In reply to Peter Krempa from comment #6)
> I can't reproduce this issue. Could you please attach debug logs and the
> domain XML before and after the snapshot please?

Hmm, thanks for the tip; I finally found out why the automated case still failed. The script created three snapshots (snap3 -> snap2 -> snap1 -> base_image), then removed the VM's disk XML and prepared a new disk XML based on the removed one. When preparing the new disk XML, the script misused snap1 as the top snapshot. And since the VM's inactive XML now carries the full backing chain, the script produced a malformed disk XML:

```xml
<disk device="disk" type="file">
  <driver name="qemu" type="qcow2" />
  <source file="/var/tmp/avocado_ownj7rss/jeos-27-x86_64.snap1" />
  <backingStore type="file">
    <format type="qcow2" />
    <source file="/var/tmp/avocado_ownj7rss/jeos-27-x86_64.snap2" />
    <backingStore type="file">
      <format type="qcow2" />
      <source file="/var/tmp/avocado_ownj7rss/jeos-27-x86_64.snap1" />
      <backingStore type="file">
        <format type="qcow2" />
        <source file="/var/lib/avocado/data/avocado-vt/images/jeos-27-x86_64.qcow2" />
        <backingStore />
      </backingStore>
    </backingStore>
  </backingStore>
  <target bus="virtio" dev="vda" />
  <address bus="0x04" domain="0x0000" function="0x0" slot="0x00" type="pci" />
</disk>
```

That chain is now snap1 -> snap2 -> snap1 -> base_img: snap1 appears twice, which is where the problem comes from. So this should be fixed in the automation script; sorry for the false alarm, the error message was just very similar. Will close this bug as verified.

Thank you for looking into it! I wouldn't have been able to figure out what's happening in the test suite so quickly.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2017
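As a closing aside, a quick sanity check along these lines would have caught the duplicated snap1 in the chain above (a hypothetical helper sketched here, not part of the avocado suite):

```python
#!/usr/bin/env python3
# Hypothetical helper: read a <disk> XML on stdin and flag a backing
# chain that references the same image file twice, as happened with
# snap1 in this report.
import sys
import xml.etree.ElementTree as ET

def chain_files(disk_xml: str):
    """Yield source file paths from a <disk> element, top image first."""
    node = ET.fromstring(disk_xml)
    while node is not None:
        src = node.find("source")
        if src is not None and src.get("file"):
            yield src.get("file")
        node = node.find("backingStore")   # descend one chain level

files = list(chain_files(sys.stdin.read()))
dupes = {f for f in files if files.count(f) > 1}
if dupes:
    sys.exit(f"duplicate image(s) in backing chain: {sorted(dupes)}")
print(" -> ".join(files))
```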