Bug 1526212
Summary: | qemu-img should not need a write lock for creating the overlay image | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux 7 | Reporter: | Han Han <hhan> |
Component: | qemu-kvm-rhev | Assignee: | Fam Zheng <famz> |
Status: | CLOSED ERRATA | QA Contact: | Ping Li <pingl> |
Severity: | high | Docs Contact: | |
Priority: | high | ||
Version: | 7.5 | CC: | areis, chayang, chhu, coli, dyuan, dzheng, famz, hhan, juzhang, lmiksik, michen, mrezanin, ngu, nsoffer, pingl, ratamir, timao, virt-maint, xuzhang, yafu, yanqzhan |
Target Milestone: | rc | Keywords: | Regression, TestBlocker |
Target Release: | --- | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | qemu-kvm-rhev-2.10.0-17.el7 | Doc Type: | If docs needed, set a value |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2018-04-11 00:55:27 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Bug Depends On: | |||
Bug Blocks: | 1525303, 1533155 |
Description
Han Han
2017-12-15 02:15:32 UTC
Posted the fix to upstream: https://lists.gnu.org/archive/html/qemu-devel/2017-12/msg02816.html

The issue also exists when trying to convert an image used by a guest.
Steps to reproduce:
1. # /usr/libexec/qemu-kvm /var/lib/libvirt/images/data.img
VNC server running on ::1:5900
2. # /usr/bin/qemu-img convert -f qcow2 -O qcow2 -o compat=1.1 /var/lib/libvirt/images/data.img /var/lib/libvirt/images/data.img-clone
qemu-img: Could not open '/var/lib/libvirt/images/data.img': Failed to get shared "write" lock
Is another process using the image?

(In reply to yafu from comment #4)
> The issue also exists when trying to convert an image used by a guest.

Don't do this. Live format conversion of an image can be done with drive-mirror, and live backup can be done with drive-backup, both in QMP. Reading guest-visible qcow2 data from qemu-img or any other program is not allowed while the image is in use by a guest.

*** Bug 1529990 has been marked as a duplicate of this bug. ***

Fam, we get the same error when using qemu-img info on an image used by a guest.

From vdsm log:
Error: Command ['/usr/bin/qemu-img', 'info', '--output', 'json', '-f', 'qcow2', u'/rhev/data-center/mnt/rich-nfs-server2.usersys.redhat.com:_home_storage_sd5/43bdddd5-2edd-45f5-a55e-c08cd36648a6/images/64e7f158-7829-4933-8b80-e785b72ebf6d/244d8cd8-9782-4e1e-8b84-77016ca11406'] failed with rc=1 out='' err='qemu-img: Could not open \'/rhev/data-center/mnt/rich-nfs-server2.usersys.redhat.com:_home_storage_sd5/43bdddd5-2edd-45f5-a55e-c08cd36648a6/images/64e7f158-7829-4933-8b80-e785b72ebf6d/244d8cd8-9782-4e1e-8b84-77016ca11406\': Failed to get shared "write" lock\nIs another process using the image?\n'

We run qemu-img info on an image to get the qcow2 compat value, and taking a lock in this case breaks released versions of RHV. I think info should behave in the same way as create in this case.

The idea is that '-U' should be used with "qemu-img info":

# qemu-img info -U ...

when the image is qcow2 AND is being used by a guest. The reason is that reading the image is inherently racy with the QEMU process and may yield unexpected results if the image is being updated. The situation is the same as with the other behavior changes caused by image locking; similarly, <shareable/> won't work with new QEMU and old libvirt. Can RHV be updated to use "-U" in this case?

We will use -U in RHV 4.2 [1], but RHV 4.1, which must run with 7.5, is already out there. Users updating to 7.5 will be hit by this backward-incompatible change. We know that accessing an image header is racy, but we need it only for getting the qcow2 compat value, which is not expected to be modified by qemu.

[1] https://gerrit.ovirt.org/85874

OK, it is probably okay to "fall back" to -U automatically for "qemu-img info" if the locking failed. I'll send a patch upstream to see what the maintainers think.

The patch mentioned in comment 10:

https://lists.gnu.org/archive/html/qemu-devel/2018-01/msg00845.html

(In reply to Fam Zheng from comment #11)
> https://lists.gnu.org/archive/html/qemu-devel/2018-01/msg00845.html

We would like to test this fix when a build is available.

I made a scratch build with the fix: https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=14889521

(In reply to Fam Zheng from comment #10)
> OK, it is probably okay to "fall back" to -U automatically for "qemu-img
> info" if the locking failed. I'll send a patch upstream to see what the
> maintainers think.

(In reply to Fam Zheng from comment #11)
> The patch mentioned in comment 10:
>
> https://lists.gnu.org/archive/html/qemu-devel/2018-01/msg00845.html

Looks like the consensus upstream is to drop this patch. Nir: can you work around this in RHV, or will we need a downstream-only patch?

Besides that, is the original report still accurate? This BZ was originally opened as a TestBlocker/Regression, but this issue with qemu-img -U looks like expected behavior on the QEMU side.
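The fallback discussed in comments 10 and 11 was proposed for qemu-img itself and was dropped upstream, but a caller can apply the same idea on its own side. The following is only a minimal sketch of that approach, not the code RHV or vdsm actually uses: the function name `qemu_img_info`, the `run` parameter (there so the command runner can be swapped out) and the string match on the error message are all assumptions made for illustration. The only qemu-img features it relies on are `info --output json` and the `-U` (force share) option mentioned in this thread.

```python
import json
import subprocess

# Error text taken from the qemu-img output quoted in this bug.
LOCK_ERROR = 'Failed to get shared "write" lock'


def qemu_img_info(path, fmt="qcow2", run=subprocess.run):
    """Return `qemu-img info` output as a dict (hypothetical helper).

    The first attempt takes the normal image lock. Only if that fails with
    the lock error is the command retried with -U, which skips locking and
    is therefore racy against a running guest.
    """
    base = ["qemu-img", "info", "--output", "json", "-f", fmt]
    result = run(base + [path], capture_output=True, text=True)
    if result.returncode != 0 and LOCK_ERROR in result.stderr:
        # Assumption: the caller only needs stable header fields such as
        # the qcow2 compat value, so an unlocked read is acceptable.
        result = run(base + ["-U", path], capture_output=True, text=True)
    if result.returncode != 0:
        raise RuntimeError(result.stderr.strip())
    return json.loads(result.stdout)
```

As noted in the comments above, the unlocked read can return inconsistent data if QEMU is updating the image, so a caller would want to limit this to fields that QEMU is not expected to modify.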
Fix included in qemu-kvm-rhev-2.10.0-17.el7

Hi Ademar, Nir,

According to the below test result, I tend to set the bug as verified. If we still intend to set force shared option "-U" as default option for "qemu-img info", could we create a new bug to track it? Thanks

Packages tested:
qemu-kvm-rhev-2.10.0-17.el7
kernel-3.10.0-829.el7.x86_64

Test steps:

1. Create backing image using qcow2 format

1) Create backing image and boot the image
# qemu-img create -f qcow2 base.qcow2 1G
# /usr/libexec/qemu-kvm -nodefaults -device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pci.0,addr=0x3 -drive id=drive_image1,if=none,snapshot=off,aio=threads,cache=none,format=qcow2,file=/home/tests/diskfile/base.qcow2 -device scsi-hd,id=image1,drive=drive_image1 -monitor stdio

2) Create snapshot
# qemu-img create -f qcow2 -b base.qcow2 -F qcow2 sn.qcow2

3) Get information of the snapshot
# qemu-img info sn.qcow2
image: sn.qcow2
file format: qcow2
virtual size: 1.0G (1073741824 bytes)
disk size: 196K
cluster_size: 65536
backing file: base.qcow2
backing file format: qcow2
Format specific information:
    compat: 1.1
    lazy refcounts: false
    refcount bits: 16
    corrupt: false

4) Check the snapshot
# qemu-img check sn.qcow2
qemu-img: Could not open 'sn.qcow2': Could not open backing file: Failed to get shared "write" lock
Is another process using the image?

2. Create backing image using raw format

1) Create backing image and boot the image
# qemu-img create -f raw base.img 1G
# /usr/libexec/qemu-kvm -nodefaults -device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pci.0,addr=0x3 -drive id=drive_image1,if=none,snapshot=off,aio=threads,cache=none,format=raw,file=/home/tests/diskfile/base.img -device scsi-hd,id=image1,drive=drive_image1 -monitor stdio

2) Create snapshot
# qemu-img create -f qcow2 -b base.img -F raw sn.qcow2

3) Get information of the snapshot
# qemu-img info sn.qcow2
image: sn.qcow2
file format: qcow2
virtual size: 1.0G (1073741824 bytes)
disk size: 196K
cluster_size: 65536
backing file: base.img
backing file format: raw
Format specific information:
    compat: 1.1
    lazy refcounts: false
    refcount bits: 16
    corrupt: false

4) Check the snapshot ----> Checked with Fam, raw is a bit more permissive for concurrent openers
# qemu-img check sn.qcow2
No errors were found on the image.
Image end offset: 262144

3. Create backing image using luks format

1) Create backing image and boot the image
# qemu-img create -f luks --object secret,id=sec0,data=base -o key-secret=sec0 base.luks 1G
# /usr/libexec/qemu-kvm -nodefaults -object secret,id=sec0,data=base -device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pci.0,addr=0x3 -drive id=drive_image1,if=none,snapshot=off,aio=threads,cache=none,format=luks,file=/home/tests/diskfile/base.luks,key-secret=sec0 -device scsi-hd,id=image1,drive=drive_image1 -monitor stdio

2) Create snapshot
# qemu-img create -f qcow2 --object secret,id=sec0,data=base -b 'json:{"driver": "luks", "file": {"driver": "file", "filename": "/home/tests/diskfile/base.luks"}, "key-secret": "sec0"}' sn.qcow2

3) Get information of the snapshot
# qemu-img info sn.qcow2
image: sn.qcow2
file format: qcow2
virtual size: 1.0G (1073741824 bytes)
disk size: 196K
cluster_size: 65536
backing file: json:{"driver": "luks", "file": {"driver": "file", "filename": "/home/tests/diskfile/base.luks"}, "key-secret": "sec0"}
Format specific information:
    compat: 1.1
    lazy refcounts: false
    refcount bits: 16
    corrupt: false

4) Check the snapshot
# qemu-img check --object secret,id=sec0,data=base --image-opts driver=qcow2,file.filename=sn.qcow2,backing.key-secret=sec0
qemu-img: Could not open 'driver=qcow2,file.filename=sn.qcow2,backing.key-secret=sec0': Could not open backing file: Failed to get shared "write" lock
Is another process using the image?

4. Create backing image using qcow2 format encrypted by luks

1) Create backing image and boot the image
# qemu-img create --object secret,id=sec0,data=base -f qcow2 -o encrypt.format=luks,encrypt.key-secret=sec0 base.qcow2 20G
# /usr/libexec/qemu-kvm -nodefaults -object secret,id=sec0,data=base -device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pci.0,addr=0x3 -drive id=drive_image1,if=none,snapshot=off,aio=threads,cache=none,format=qcow2,file=/home/tests/diskfile/base.qcow2,encrypt.key-secret=sec0 -device scsi-hd,id=image1,drive=drive_image1 -monitor stdio

2) Create snapshot
# qemu-img create --object secret,id=sec0,data=base -f qcow2 -b 'json:{"encrypt.key-secret": "sec0", "driver": "qcow2", "file": {"driver": "file", "filename": "/home/tests/diskfile/base.qcow2"}}' sn.qcow2

3) Get information of the snapshot
# qemu-img info sn.qcow2
image: sn.qcow2
file format: qcow2
virtual size: 20G (21474836480 bytes)
disk size: 196K
cluster_size: 65536
backing file: json:{"encrypt.key-secret": "sec0", "driver": "qcow2", "file": {"driver": "file", "filename": "/home/tests/diskfile/base.qcow2"}}
Format specific information:
    compat: 1.1
    lazy refcounts: false
    refcount bits: 16
    corrupt: false

4) Check the snapshot
# qemu-img check --object secret,id=sec0,data=base --image-opts driver=qcow2,file.filename=sn.qcow2,backing.encrypt.key-secret=sec0
qemu-img: Could not open 'driver=qcow2,file.filename=sn.qcow2,backing.encrypt.key-secret=sec0': Could not open backing file: Failed to get shared "write" lock
Is another process using the image?
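The `json:` backing-file strings used in scenarios 3 and 4 are easy to mistype by hand. As a rough sketch, they could be generated instead; the helper name `json_backing_spec` and its parameters are hypothetical and exist only for this illustration. It only builds the string and does not call qemu-img.

```python
import json


def json_backing_spec(driver, filename, options=None):
    """Build a 'json:{...}' backing-file string (hypothetical helper).

    `driver` is the format driver of the backing image (for example "luks"
    or "qcow2"), `filename` is the path of the backing file, and `options`
    holds extra driver options such as {"key-secret": "sec0"}.
    """
    spec = {
        "driver": driver,
        "file": {"driver": "file", "filename": filename},
    }
    spec.update(options or {})
    return "json:" + json.dumps(spec)


# Backing spec for the luks scenario; the path and secret id are the ones
# used in the test steps above.
luks_backing = json_backing_spec(
    "luks", "/home/tests/diskfile/base.luks", {"key-secret": "sec0"}
)
```

The resulting string would be passed as the argument of `-b` in the `qemu-img create` commands shown in the test steps.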
(In reply to Ping Li from comment #18)
> Hi Ademar, Nir,
>
> According to the below test result, I tend to set the bug as verified. If we
> still intend to set force shared option "-U" as default option for "qemu-img
> info", could we create a new bug to track it? Thanks

That's correct. Nir: please open a new BZ if RHV needs the -U change.

(In reply to Ademar Reis from comment #19)
> (In reply to Ping Li from comment #18)
> > Hi Ademar, Nir,
> >
> > According to the below test result, I tend to set the bug as verified. If we
> > still intend to set force shared option "-U" as default option for "qemu-img
> > info", could we create a new bug to track it? Thanks
>
> That's correct. Nir: please open a new BZ if RHV needs the -U change.

NEEDINFO(Nir and Han)

(In reply to Ademar Reis from comment #20)
> (In reply to Ademar Reis from comment #19)
> > (In reply to Ping Li from comment #18)
> > > Hi Ademar, Nir,
> > >
> > > According to the below test result, I tend to set the bug as verified. If we
> > > still intend to set force shared option "-U" as default option for "qemu-img
> > > info", could we create a new bug to track it? Thanks
> >
> > That's correct. Nir: please open a new BZ if RHV needs the -U change.
>
> NEEDINFO(Nir and Han)

Filed a new bug to track this: https://bugzilla.redhat.com/show_bug.cgi?id=1535992

According to the comment 18 and comment 21, set the bug as verified.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:1104