Bug 1526313
Summary: | Improve QEMU lock error info for hot-plugging same qcow2 image file twice in different target to VM | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux 7 | Reporter: | jiyan <jiyan> |
Component: | qemu-kvm-rhev | Assignee: | Hanna Czenczek <hreitz> |
Status: | CLOSED ERRATA | QA Contact: | Tingting Mao <timao> |
Severity: | medium | Docs Contact: | |
Priority: | medium | ||
Version: | 7.5 | CC: | aliang, chayang, coli, dyuan, guillaume.pavese, hhan, jiyan, juzhang, knoel, lmen, meili, mtessun, ngu, pkrempa, qzhang, virt-maint, xuwei, xuzhang, yhong, yisun |
Target Milestone: | rc | Keywords: | TestOnly |
Target Release: | --- | ||
Hardware: | x86_64 | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | If docs needed, set a value | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2019-08-22 09:18:46 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Bug Depends On: | 1551486 | ||
Bug Blocks: | 1609654 |
Description
jiyan
2017-12-15 08:11:48 UTC
Moving to qemu. These errors are reported by qemu from the locking infrastructure and it isn't practical for libvirt to reimplement everything. Libvirt's locking would report a proper error in the second case, but the first case is okay from libvirt's POV. Also note that the error may be weird due to the downstream command __com.redhat_drive_add, while blockdev-add may report proper error. So, when image locking failed, the error message on the command line and HMP are more informative (there is this hint message "Is another process using the image?" after the error) than QMP because QMP error info doesn't include the "hint" info. We could add "hint" key to the returned error object in QMP. This also means some work on libvirt to return it to the user. The easier alternative is to squash that sentence into the error message and be done. Meanwhile I'm posting a middle ground solution to upstream as RFC: always include "hint" messages in QMP error texts, which has the advantages of both sides: include hint in QMP errors generically so all errors can benefit, and no libvirt change is necessary. Hi, Fam. I can not create external snapshot after raising the error info as this bug describes, could you please help to check whether it is same issue or another issue related to snapshot? Thanks in advance. Version: kernel-3.10.0-860.el7.x86_64 qemu-kvm-rhev-2.10.0-21.el7.x86_64 libvirt-3.9.0-14.virtcov.el7.x86_64 Steps: 1. Prepare a shutdown VM with bootable source file as following: # virsh domstate test4 shut off # virsh dumpxml test4 |grep "<disk" -A5 <disk type='file' device='disk'> <driver name='qemu' type='qcow2'/> <source file='/var/lib/libvirt/images/RHEL-7.5-x86_64-latest.qcow2'/> <target dev='hda' bus='ide'/> <address type='drive' controller='0' bus='0' target='0' unit='0'/> </disk> 2. Prepare a xml file, the source file is same with Step-1. # cat test4.xml <disk type='file' device='disk'> <driver name='qemu' type='qcow2'/> <source file='/var/lib/libvirt/images/RHEL-7.5-x86_64-latest.qcow2'/> <target dev='hdb' bus='virtio'/> </disk> 3. Start VM and hot-plug, the error will raise # virsh start test4 Domain test4 started # virsh attach-device test4 test4.xml error: Failed to attach device from test4.xml error: internal error: unable to execute QEMU command '__com.redhat_drive_add': Device 'drive-virtio-disk1' could not be initialized 4. Try to create external snapshot, it will fail # virsh snapshot-create-as test4 t4s1 --disk-only --diskspec hda,snapshot=external,file=/tmp/t4s1 error: internal error: unable to execute QEMU command 'transaction': Failed to lock byte 100 # virsh snapshot-create-as test4 t4s1 --disk-only --diskspec hda,snapshot=external,file=/tmp/t4s1 error: internal error: unable to execute QEMU command 'transaction': Failed to lock byte 100 # virsh snapshot-create-as test4 t4s1 --disk-only --diskspec hda,snapshot=external,file=/tmp/t4s1 error: internal error: unable to execute QEMU command 'transaction': Failed to lock byte 100 I find that disabling SELinux will fix the issue. My guess is that the error handling logic in libvirt triggered in step 3 has misconfigured the context of the image or something, as a result, a following locking operation when QEMU tries to create the snapshot is rejected by system security policy: [pid 9863] fcntl(17, F_OFD_SETLK, {l_type=F_RDLCK, l_whence=SEEK_SET, l_start=100, l_len=1}) = -1 EACCES (Permission denied) I think it's a bug in libvirt, please file a separate BZ. Need to wait for blockdev-add infrastructure to be available in libvirt, so we can retest. See also: bug 609654 and bug 1589627. Adding the libvirt blockdev BZ as a dependency. (In reply to Ademar Reis from comment #8) > Need to wait for blockdev-add infrastructure to be available in libvirt, so > we can retest. See also: bug 609654 and bug 1589627. I meant bug 1609654 (typo). (In reply to Fam Zheng from comment #4) > Meanwhile I'm posting a middle ground solution to upstream as RFC: always > include "hint" messages in QMP error texts, which has the advantages of both > sides: include hint in QMP errors generically so all errors can benefit, and > no libvirt change is necessary. This seems to be in QEMU 3.1: commit b857431d2abe3945b672b41f33690e9943a8752a Author: Fam Zheng <famz> Date: Tue Sep 25 13:05:01 2018 +0800 file-posix: Include filename in locking error message Image locking errors happening at device initialization time doesn't say which file cannot be locked, for instance, -device scsi-disk,drive=drive-1: Failed to get shared "write" lock Is another process using the image? could refer to either the overlay image or its backing image. Hoist the error_append_hint to the caller of raw_check_lock_bytes where file name is known, and include it in the error hint. Signed-off-by: Fam Zheng <famz> Reviewed-by: Eric Blake <eblake> Signed-off-by: Kevin Wolf <kwolf> (In reply to Ademar Reis from comment #10) > This seems to be in QEMU 3.1: That commit is part of my current backport of BZ 1551486 (to reduce conflicts). Max Verified this bug as below: Tested with: qemu-kvm-rhev-2.12.0-26.el7 kernel-3.10.0-1025.el7 Steps: Scenario 1 (For qcow2 files) 1. Boot a guest with no data disk (qemu) info block drive_image1 (#block181): /home/kvm_autotest_root/images/rhel77-64-virtio-scsi.qcow2 (qcow2) Attached to: /machine/peripheral/virtio_blk_pci0/virtio-backend Cache mode: writethrough, ignore flushes 2. Hotplug the same qcow2 image twice in QMP 2.1 First time {"QMP": {"version": {"qemu": {"micro": 0, "minor": 12, "major": 2}, "package": "qemu-kvm-rhev-2.12.0-26.el7"}, "capabilities": []}} {"execute":"qmp_capabilities"} {"return": {}} { "execute": "blockdev-add", "arguments": { "node-name": "driver2", "driver": "qcow2", "file": { "driver": "file", "filename": "/home/test/data.qcow2" } } } {"return": {}} { "execute":"device_add", "arguments":{ "driver":"virtio-scsi-pci", "id":"shuang-pci" } } {"return": {}} { "execute":"device_add", "arguments":{ "driver":"scsi-hd", "drive":"driver2", "id":"shuang-hd" } } {"return": {}} 2.2 Second time { "execute": "blockdev-add", "arguments": { "node-name": "driver3", "driver": "qcow2", "file": { "driver": "file", "filename": "/home/test/data.qcow2" } } } {"error": {"class": "GenericError", "desc": "Failed to get \"write\" lock"}} Scenario 2(For raw file) Hotplug for first time { "execute": "blockdev-add", "arguments": { "node-name": "driver3", "driver": "raw", "file": { "driver": "file", "filename": "/home/test/data.img" } } } {"return": {}} { "execute":"device_add", "arguments":{ "driver":"scsi-hd", "drive":"driver3", "id":"shuang-hd1" } } {"return": {}} Hotplug for second time { "execute": "blockdev-add", "arguments": { "node-name": "driver4", "driver": "raw", "file": { "driver": "file", "filename": "/home/test/data.img" } } } {"return": {}} { "execute":"device_add", "arguments":{ "driver":"scsi-hd", "drive":"driver4", "id":"shuang-hd2" } } {"error": {"class": "GenericError", "desc": "Failed to get \"write\" lock"}} Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2019:2553 |