Bug 1378241 - QEMU image file locking
Summary: QEMU image file locking
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: qemu-kvm-rhev
Version: 7.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: unspecified
Target Milestone: rc
Target Release: 7.4
Assignee: Fam Zheng
QA Contact: Ping Li
URL:
Whiteboard:
Duplicates: 1371458
Depends On: 1444778 1461231
Blocks: 1378242 1415250 1415252 1417306 1432523 1469590
Reported: 2016-09-21 21:51 UTC by Ademar Reis
Modified: 2018-04-11 00:09 UTC (History)

Fixed In Version: qemu-kvm-rhev-2.10.0-3.el7
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 1378242 1417306 1432523
Environment:
Last Closed: 2018-04-11 00:09:32 UTC


Links
System | ID | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHSA-2018:1104 | normal | SHIPPED_LIVE | Important: qemu-kvm-rhev security, bug fix, and enhancement update | 2018-04-10 22:54:38 UTC

Description Ademar Reis 2016-09-21 21:51:48 UTC
Implement QEMU image locking, which should prevent another QEMU or qemu-img process from opening an image while a VM is using it.

Upstream series (v7): https://lists.gnu.org/archive/html/qemu-devel/2016-08/msg01306.html

Comment 6 Fam Zheng 2017-01-23 06:27:15 UTC
*** Bug 1371458 has been marked as a duplicate of this bug. ***

Comment 8 Fam Zheng 2017-02-13 12:58:14 UTC
We deliberately want QEMU to lock the images by itself instead of relying on libvirt or RHV, to protect against manual invocation of QEMU commands, such as 'qemu-img snapshot' on an in-use qcow2 image, which is a very common but dangerous misuse.

IMO this, SANlock and libvirt virtlockd (which has a SANlock plugin too) can go together.

Comment 9 Martin Tessun 2017-02-13 13:42:47 UTC
(In reply to Fam Zheng from comment #8)
> We deliberately want QEMU to lock the images by itself instead of relying on
> libvirt or RHV, to protect manual invocation of QEMU commands, such as
> 'qemu-img snapshot' against an in-use qcow2 image, which is a very common
> but dangerous misuse.
> 

Ack. So does this also prevent another host, which has access to the same block device the image lives on, from using this volume?
If not, I believe we are missing the most important use case for RHV and OSP (let's say enterprise) installations, as these typically have several hosts that might access the image concurrently.

> IMO this, SANlock and libvirt virtlockd (which has a SANlock plugin too) can
> go together.

Ack. To resolve the above concerns, a "joint" approach between QEMU and libvirt may be needed to ensure that "remote access" to the image is also denied.

Comment 10 Fam Zheng 2017-02-13 14:07:45 UTC
(In reply to Martin Tessun from comment #9)
> So does this also prevent to use this volume from another host that has
> access to the same block device where the image does live on?

The current work in progress focuses on file-system-based locks (OFD locks), which protect local images and images on NFS, but it has not considered block-based solutions like SANlock or SCSI reservations. Gluster and Ceph are developing their own locking implementations, which could be integrated into (or even transparently benefit) QEMU in the future.

Comment 11 Ademar Reis 2017-02-13 14:41:15 UTC
(In reply to Martin Tessun from comment #9)
> (In reply to Fam Zheng from comment #8)
> > We deliberately want QEMU to lock the images by itself instead of relying on
> > libvirt or RHV, to protect manual invocation of QEMU commands, such as
> > 'qemu-img snapshot' against an in-use qcow2 image, which is a very common
> > but dangerous misuse.
> > 
> 
> Ack. So does this also prevent to use this volume from another host that has
> access to the same block device where the image does live on?
> If not, I believe we are missing the most important usecase for RHV and OSP
> (let's say enterprise) installations, as these typically have more hosts
> that might access the image concurrently.
>

I fully agree and we have discussed this in the recent past, concluding that the only viable solution appears to be some sort of monitoring done by libvirt on all block devices being used by a VM:

    Bug 1337005 - Log event when a block device in use by a guest is open read-write by external applications

The bug description contains a good explanation for the motivation and the use-cases that it covers.

But we (libvirt) need to investigate and better research the idea and potential solutions.

Comment 12 Daniel Berrangé 2017-03-08 15:33:49 UTC
(In reply to Ademar Reis from comment #11)
> (In reply to Martin Tessun from comment #9)
> > (In reply to Fam Zheng from comment #8)
> > > We deliberately want QEMU to lock the images by itself instead of relying on
> > > libvirt or RHV, to protect manual invocation of QEMU commands, such as
> > > 'qemu-img snapshot' against an in-use qcow2 image, which is a very common
> > > but dangerous misuse.
> > > 
> > 
> > Ack. So does this also prevent to use this volume from another host that has
> > access to the same block device where the image does live on?
> > If not, I believe we are missing the most important usecase for RHV and OSP
> > (let's say enterprise) installations, as these typically have more hosts
> > that might access the image concurrently.
> >
> 
> I fully agree and we have discussed this in the recent past, concluding that
> the only viable solution appears to be some sort of monitoring done by
> libvirt on all block devices being used by a VM:
> 
>     Bug 1337005 - Log event when a block device in use by a guest is open
> read-write by external applications
> 
> The bug description contains a good explanation for the motivation and the
> use-cases that it covers.
> 
> But we (libvirt) need to investigate and better research the idea and
> potential solutions.

Libvirt already has a framework for disk locking that can protect against concurrent access from multiple hosts. It has two plugins, one using sanlock and one using virtlockd. The virtlockd daemon uses fcntl() locks, but does not have to apply them directly to the disk file. It can use a lock-aside lock file based on some unique identifier. For example, you can mount an NFS volume at /var/lib/libvirt/lockspace and use the SCSI WWID or LVM UUID as the name of a lock file to create in that directory. This of course requires that all nodes in the cluster have that NFS volume mounted, but it does not require high-bandwidth data transfer for this volume. There is also scope for creating new plugins for libvirt to use other mechanisms if someone comes up with other ideas.
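
A minimal sketch of the lock-aside idea described above, not virtlockd's actual implementation. The lock is taken on a small file named after a unique disk identifier inside a lockspace directory, rather than on the disk itself. The function name acquire_disk_lock, the example identifier, and the temporary directory standing in for /var/lib/libvirt/lockspace are assumptions for illustration. The sketch uses flock() through Python's fcntl.flock instead of the fcntl() locks mentioned in the comment, only so that the conflict can be observed from a single process.

```python
import fcntl
import os
import tempfile


def acquire_disk_lock(lockspace_dir, disk_id):
    """Try to take an exclusive lock on <lockspace_dir>/<disk_id>.

    Returns the open file descriptor on success (keep it open to hold the
    lock), or None if another open file description already holds the lock.
    """
    path = os.path.join(lockspace_dir, disk_id)
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        # Non-blocking: fail immediately instead of waiting for the holder.
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        os.close(fd)
        return None
    return fd


def demo():
    # A temporary directory stands in for the shared lockspace mount.
    with tempfile.TemporaryDirectory() as lockspace:
        disk_id = "example-lvm-uuid"  # hypothetical identifier
        first = acquire_disk_lock(lockspace, disk_id)
        second = acquire_disk_lock(lockspace, disk_id)  # conflicts with first
        os.close(first)  # closing the descriptor releases the lock
        third = acquire_disk_lock(lockspace, disk_id)
        outcome = (first is not None, second is None, third is not None)
        os.close(third)
        return outcome
```

Because the lock file carries no disk data, only the tiny lockspace directory has to be shared between hosts. Whether a given lock type is honoured across hosts depends on the shared file system, so a real deployment would need to verify that separately.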

Comment 13 Ping Li 2017-03-14 10:19:14 UTC
Hi Fam,

Since the rhel-7.4.0+ flag is set, QE would like to know whether the issue will be fixed in RHEL 7.4, so that QE can add it to the RHEL 7.4 test plan for disk formats.

If there is a plan to fix it in RHEL 7.4, could you help check the verification steps below?
Scenario 1 (the image is located on a local file system/NFS/iSCSI/Ceph/Gluster):
Boot the same image twice and check whether the second boot is rejected.
Scenario 2 (the image is located on a local file system/NFS/iSCSI/Ceph/Gluster):
Boot an image, then use "qemu-img info" to get the image info and check whether the latter is rejected.

Thanks.

Comment 22 Fam Zheng 2017-06-13 06:24:18 UTC
The image locking patches are merged for upstream as commit 244a5668106297378391b768e7288eb157616f64.

Comment 24 Ping Li 2017-10-11 10:54:36 UTC
Hi Fam,

I tried the image locking as below, but it seems the feature is not enabled. I'm not sure whether this is related to the kernel.

Version-Release number of selected component
# uname -r
3.10.0-732.el7.x86_64
# /usr/libexec/qemu-kvm -version
QEMU emulator version 2.10.0(qemu-kvm-rhev-2.10.0-1.el7)
# /usr/bin/qemu-img --version
qemu-img version 2.10.0(qemu-kvm-rhev-2.10.0-1.el7)

Test steps:
1. Boot up the guest with created image
/usr/libexec/qemu-kvm \
    ...
    -drive id=drive_image1,if=none,snapshot=off,aio=native,cache=none,format=qcow2,file=/home/tests/diskfile/rhel75.qcow2 \
    -device virtio-blk-pci,id=image1,drive=drive_image1,bootindex=0,bus=pci.0,addr=0x3 \

2. Check whether multiple runs of qemu or qemu-img could be prevented 
2.1 Boot up the image again
/usr/libexec/qemu-kvm \
    ...
    -drive id=drive_image1,if=none,snapshot=off,aio=native,cache=none,format=qcow2,file=/home/tests/diskfile/rhel75.qcow2 \
    -device virtio-blk-pci,id=image1,drive=drive_image1,bootindex=0,bus=pci.0,addr=0x3

2.2 Check the image
# /usr/bin/qemu-img check /home/tests/diskfile/rhel75.qcow2
No errors were found on the image.
23360/327680 = 7.13% allocated, 13.84% fragmented, 0.00% compressed clusters
Image end offset: 1531772928

2.3 Get the image info
# /usr/bin/qemu-img info /home/tests/diskfile/rhel75.qcow2
image: /home/tests/diskfile/rhel75.qcow2
file format: qcow2
virtual size: 20G (21474836480 bytes)
disk size: 1.4G
cluster_size: 65536
Format specific information:
    compat: 1.1
    lazy refcounts: false
    refcount bits: 16
    corrupt: false

2.4 Boot up the snapshot
/usr/libexec/qemu-kvm \
    ...
    -drive id=drive_image1,if=none,snapshot=off,aio=native,cache=none,format=qcow2,file=/home/tests/diskfile/sn.qcow2 \
    -device virtio-blk-pci,id=image1,drive=drive_image1,bootindex=0,bus=pci.0,addr=0x3

However, all of the above operations succeed.
# lslocks 
COMMAND           PID  TYPE SIZE MODE  M START END PATH
iscsid          13463 POSIX   6B WRITE 0     0   0 /run/iscsid.pid
sendmail        19233 POSIX  34B WRITE 0     0   0 /run/sendmail.pid
rhsmcertd        1307 FLOCK   0B WRITE 0     0   0 /run/lock/subsys/rhsmcertd
crond            3994 FLOCK   5B WRITE 0     0   0 /run/crond.pid
(unknown)        1321 FLOCK   0B WRITE 0     0   0 /run
libvirtd         1425 POSIX   4B WRITE 0     0   0 /run/libvirtd.pid
atd              1453 POSIX   5B WRITE 0     0   0 /run/atd.pid
lvmetad           717 POSIX   4B WRITE 0     0   0 /run/lvmetad.pid
sendmail        19255 POSIX  50B WRITE 0     0   0 /run/sm-client.pid

Additional info:
When I tested upstream QEMU, image locking worked as expected.

Version-Release number of selected component
kernel-4.5.5-201.fc23.x86_64
$ ./x86_64-softmmu/qemu-system-x86_64 -version
QEMU emulator version 2.10.0
$ ./qemu-img --v
qemu-img version 2.10.0

Test steps:
1. Boot up guest with the image
$ ./x86_64-softmmu/qemu-system-x86_64 test.qcow2

2. Check whether multiple runs of qemu or qemu-img could be prevented 
$ ./qemu-img info test.qcow2 
qemu-img: Could not open 'test.qcow2': Failed to get shared "write" lock
Is another process using the image?
$ ./qemu-img check test.qcow2 
qemu-img: Could not open 'test.qcow2': Failed to get shared "write" lock
Is another process using the image?
$ ./x86_64-softmmu/qemu-system-x86_64 test.qcow2 
qemu-system-x86_64: Failed to get "write" lock
Is another process using the image?

$ sudo lslocks | grep qemu
qemu-system-x86 14706 OFDLCK 192.5K READ  0        100        101 /home/nhire/workspace/software/qemu-2.10.0/test.qcow2
qemu-system-x86 14706 OFDLCK 192.5K READ  0        103        103 /home/nhire/workspace/software/qemu-2.10.0/test.qcow2
qemu-system-x86 14706 OFDLCK 192.5K READ  0        201        201 /home/nhire/workspace/software/qemu-2.10.0/test.qcow2
qemu-system-x86 14706 OFDLCK 192.5K READ  0        203        203 /home/nhire/workspace/software/qemu-2.10.0/test.qcow2

Comment 25 Fam Zheng 2017-10-11 12:53:53 UTC
The reason is indeed the kernel and glibc (as in the dependent BZs 1444778 and 1461231). But I think we can already do something here. I'll take a look tomorrow and sort out the plan.

Comment 27 Ping Li 2017-10-15 15:32:50 UTC
Hi Fam,

The scratch build passed almost all of the tests. When a VM is booted from a snapshot, checking the backing file or getting information about the backing file is still allowed. Could you help check whether this is expected? Thanks


Version-Release number of selected component:
# uname -r
3.10.0-734.el7.x86_64
# /usr/bin/qemu-img --v
qemu-img version 2.10.0(qemu-kvm-rhev-2.10.0-1.el7.fzheng201710122121)
# /usr/libexec/qemu-kvm -version
QEMU emulator version 2.10.0(qemu-kvm-rhev-2.10.0-1.el7.fzheng201710122121)

Test steps:
1. Boot the same image twice, inspect whether the second boot is rejected.
1.1 Test the image on local file system.
1) Boot a guest with image.
2) Boot a second guest with the same image.
qemu-kvm: -drive id=drive_image1,if=none,snapshot=off,aio=native,cache=none,format=qcow2,file=/home/tests/diskfile/base.qcow2: Failed to get "write" lock
Is another process using the image?

1.2 Test the image on nfs backend.
1) Boot a guest with image.
2) Boot a second guest with the same image.
qemu-kvm: -drive id=drive_image1,if=none,snapshot=off,aio=native,cache=none,format=qcow2,file=/home/tests/nfs/nfs.qcow2: Failed to get "write" lock
Is another process using the image?

2. Boot a guest with the image, inspect whether checking the image and getting the information about the image are rejected.
2.1 Test the image on local file system.
1) Boot a guest with image.
2) Check the image.
# qemu-img check base.qcow2 
qemu-img: Could not open 'base.qcow2': Failed to get shared "write" lock
Is another process using the image?
3) Get the information of the image.
# qemu-img info base.qcow2 
qemu-img: Could not open 'base.qcow2': Failed to get shared "write" lock
Is another process using the image?

2.2 Test the image on nfs backend.
1) Boot a guest with image.
2) Check the image.
# qemu-img check nfs.qcow2 
qemu-img: Could not open 'nfs.qcow2': Failed to get shared "write" lock
Is another process using the image?
3) Get the information of the image.
# qemu-img info nfs.qcow2 
qemu-img: Could not open 'nfs.qcow2': Failed to get shared "write" lock
Is another process using the image?

3. Boot a snapshot, inspect whether booting backing file is rejected.
3.1 Test the image on local file system.
1) Boot a guest with the snapshot.
2) Boot a second guest with the backing file.
qemu-kvm: -drive id=drive_image1,if=none,snapshot=off,aio=native,cache=none,format=qcow2,file=/home/tests/diskfile/base.qcow2: Failed to get "write" lock
Is another process using the image?
3) Check the backing file.
# qemu-img check base.qcow2 
No errors were found on the image.
20360/327680 = 6.21% allocated, 13.55% fragmented, 0.00% compressed clusters
Image end offset: 1335164928
4) Get the information of the backing file.
# qemu-img info base.qcow2 
image: base.qcow2
file format: qcow2
virtual size: 20G (21474836480 bytes)
disk size: 1.2G
cluster_size: 65536
Format specific information:
    compat: 1.1
    lazy refcounts: false
    refcount bits: 16
    corrupt: false

3.2 Test the image on nfs backend.
1) Boot a guest with the snapshot.
2) Boot a second guest with the backing file.
qemu-kvm: -drive id=drive_image1,if=none,snapshot=off,aio=native,cache=none,format=qcow2,file=/home/tests/nfs/nfs.qcow2: Failed to get "write" lock
Is another process using the image?
3) Check the backing file.
# qemu-img check nfs.qcow2 
No errors were found on the image.
21565/327680 = 6.58% allocated, 22.37% fragmented, 0.00% compressed clusters
Image end offset: 1414201344
4) Get the information of the backing file.
# qemu-img info nfs.qcow2 
image: nfs.qcow2
file format: qcow2
virtual size: 20G (21474836480 bytes)
disk size: 1.3G
cluster_size: 65536
Format specific information:
    compat: 1.1
    lazy refcounts: false
    refcount bits: 16
    corrupt: false

4. Boot a guest with the image, inspect whether checking the image and getting the information about the image are allowed with the option "-U".
4.1 Test the image on local file system.
1) Boot a guest with image.
2) Check the image with option "-U".
# qemu-img check -U base.qcow2 
No errors were found on the image.
20360/327680 = 6.21% allocated, 13.55% fragmented, 0.00% compressed clusters
Image end offset: 1335164928
3) Get the information of the image with option "-U".
# qemu-img info -U base.qcow2 
image: base.qcow2
file format: qcow2
virtual size: 20G (21474836480 bytes)
disk size: 1.2G
cluster_size: 65536
Format specific information:
    compat: 1.1
    lazy refcounts: false
    refcount bits: 16
    corrupt: false

4.2 Test the image on nfs backend.
1) Boot a guest with image.
2) Check the image with option "-U".
# qemu-img check -U nfs.qcow2 
No errors were found on the image.
21565/327680 = 6.58% allocated, 22.37% fragmented, 0.00% compressed clusters
Image end offset: 1414201344
3) Get the information of the image with option "-U".
# qemu-img info -U nfs.qcow2 
image: nfs.qcow2
file format: qcow2
virtual size: 20G (21474836480 bytes)
disk size: 1.3G
cluster_size: 65536
Format specific information:
    compat: 1.1
    lazy refcounts: false
    refcount bits: 16
    corrupt: false

Comment 28 Fam Zheng 2017-10-17 03:49:26 UTC
(In reply to pingl from comment #27)
> Hi Fam,
> 
> Almost passed the test with the scratch build. When booting a vm with
> snapshot, found checking backing file or getting information of backing file
> is allowed. Could you help to check whether it is expected? Thanks

Yes, it is. Backing files are used by the guest as "readonly", so "qemu-img info" opening one read-only is allowed.
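
The outcomes reported in comments 24 and 27 can be reproduced with a toy model. This is an illustrative sketch, not QEMU's code: it assumes each opener declares the permissions it needs and the permissions it is willing to share, and that an open fails when one side needs a permission the other refuses to share. The permission sets given to the guest, the backing-file user, and qemu-img are assumptions chosen to match the messages above, and all names are hypothetical. The byte offsets in the lslocks output from comment 24 are presumably how such bits are published between processes, but the model keeps everything in memory.

```python
from dataclasses import dataclass

READ, WRITE, RESIZE = "consistent read", "write", "resize"
PERMS = (READ, WRITE, RESIZE)  # conflicts are reported in this order
ALL = frozenset(PERMS)


@dataclass(frozen=True)
class Opener:
    name: str
    needs: frozenset   # permissions this opener uses
    shares: frozenset  # permissions it tolerates other openers using


def try_open(new, holders):
    """Return None if `new` may open the image, else an error string."""
    for held in holders:
        # The newcomer needs something an existing opener will not share.
        for perm in PERMS:
            if perm in new.needs and perm not in held.shares:
                return f'Failed to get "{perm}" lock'
        # An existing opener uses something the newcomer will not share.
        for perm in PERMS:
            if perm in held.needs and perm not in new.shares:
                return f'Failed to get shared "{perm}" lock'
    return None


# Assumed permission sets, chosen to match the behaviour reported above.
guest_rw = Opener("guest, image opened read-write", ALL, frozenset({READ}))
guest_backing = Opener("guest, backing file", frozenset({READ}), frozenset({READ}))
qemu_img_info = Opener("qemu-img info", frozenset({READ}), frozenset({READ}))
qemu_img_info_u = Opener("qemu-img info -U", frozenset({READ}), ALL)
```

With these assumed sets, try_open(guest_rw, [guest_rw]) yields 'Failed to get "write" lock', try_open(qemu_img_info, [guest_rw]) yields 'Failed to get shared "write" lock', and try_open(qemu_img_info, [guest_backing]) returns None, which matches the backing-file result in comment 27.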

Comment 29 Miroslav Rezanina 2017-10-20 09:30:37 UTC
Fix included in qemu-kvm-rhev-2.10.0-3.el7

Comment 31 Ping Li 2017-10-24 09:02:00 UTC
Passed the tests below; setting the bug as verified.

Version-Release number of selected component:
kernel-3.10.0-745.el7.x86_64
qemu-kvm-rhev-2.10.0-3.el7

Test steps:
1. Boot the same image twice, inspect whether the second boot is rejected.
1.1 Test the image on local file system.
1) Boot a guest with image.
(qemu) info block
drive_image1 (#block151): /home/tests/diskfile/base.qcow2 (qcow2)
    Attached to:      /machine/peripheral/image1/virtio-backend
    Cache mode:       writethrough, direct
2) Boot a second guest with the same image.
qemu-kvm: -drive id=drive_image1,if=none,snapshot=off,aio=native,cache=none,format=qcow2,file=/home/tests/diskfile/base.qcow2: Failed to get "write" lock
Is another process using the image?

1.2 Test the image on nfs backend.
1) Boot a guest with image.
(qemu) info  block
drive_image1 (#block151): /home/tests/nfs/base.qcow2 (qcow2)
    Attached to:      /machine/peripheral/image1/virtio-backend
    Cache mode:       writethrough, direct
2) Boot a second guest with the same image.
qemu-kvm: -drive id=drive_image1,if=none,snapshot=off,aio=native,cache=none,format=qcow2,file=/home/tests/nfs/base.qcow2: Failed to get "write" lock
Is another process using the image?

2. Boot a guest with the image, inspect whether checking the image and getting the information about the image are rejected.
2.1 Test the image on local file system.
1) Boot a guest with image.
(qemu) info block
drive_image1 (#block151): /home/tests/diskfile/base.qcow2 (qcow2)
    Attached to:      /machine/peripheral/image1/virtio-backend
    Cache mode:       writethrough, direct
2) Check the image.
# qemu-img check base.qcow2 
qemu-img: Could not open 'base.qcow2': Failed to get shared "write" lock
Is another process using the image?
3) Get the information of the image.
# qemu-img info base.qcow2 
qemu-img: Could not open 'base.qcow2': Failed to get shared "write" lock
Is another process using the image?

2.2 Test the image on nfs backend.
1) Boot a guest with image.
(qemu) info block
drive_image1 (#block112): /home/tests/nfs/base.qcow2 (qcow2)
    Attached to:      /machine/peripheral/image1/virtio-backend
    Cache mode:       writethrough, direct
2) Check the image.
# qemu-img check base.qcow2 
qemu-img: Could not open 'base.qcow2': Failed to get shared "write" lock
Is another process using the image?
3) Get the information of the image.
# qemu-img info base.qcow2 
qemu-img: Could not open 'base.qcow2': Failed to get shared "write" lock
Is another process using the image?

3. Boot a snapshot, inspect whether booting backing file is rejected.
3.1 Test the image on local file system.
1) Boot a guest with the snapshot.
(qemu) info block
drive_image1 (#block150): /home/tests/diskfile/sn.qcow2 (qcow2)
    Attached to:      /machine/peripheral/image1/virtio-backend
    Cache mode:       writethrough, direct
    Backing file:     /home/tests/diskfile/base.qcow2 (chain depth: 1)
2) Boot a second guest with the backing file.
qemu-kvm: -drive id=drive_image1,if=none,snapshot=off,aio=native,cache=none,format=qcow2,file=/home/tests/diskfile/base.qcow2: Failed to get "write" lock
Is another process using the image?
3) Check the backing file.
# qemu-img check base.qcow2 
No errors were found on the image.
22276/327680 = 6.80% allocated, 13.36% fragmented, 0.00% compressed clusters
Image end offset: 1460797440
4) Get the information of the backing file.
# qemu-img info base.qcow2 
image: base.qcow2
file format: qcow2
virtual size: 20G (21474836480 bytes)
disk size: 1.4G
cluster_size: 65536
Format specific information:
    compat: 1.1
    lazy refcounts: false
    refcount bits: 16
    corrupt: false

3.2 Test the image on nfs backend.
1) Boot a guest with the snapshot.
(qemu) info block
drive_image1 (#block112): /home/tests/nfs/sn.qcow2 (qcow2)
    Attached to:      /machine/peripheral/image1/virtio-backend
    Cache mode:       writethrough, direct
    Backing file:     /home/tests/nfs/base.qcow2 (chain depth: 1)
2) Boot a second guest with the backing file.
qemu-kvm: -drive id=drive_image1,if=none,snapshot=off,aio=native,cache=none,format=qcow2,file=/home/tests/nfs/base.qcow2: Failed to get "write" lock
Is another process using the image?
3) Check the backing file.
# qemu-img check base.qcow2 
No errors were found on the image.
23164/327680 = 7.07% allocated, 19.66% fragmented, 0.00% compressed clusters
Image end offset: 1518993408
4) Get the information of the backing file.
# qemu-img info base.qcow2 
image: base.qcow2
file format: qcow2
virtual size: 20G (21474836480 bytes)
disk size: 1.4G
cluster_size: 65536
Format specific information:
    compat: 1.1
    lazy refcounts: false
    refcount bits: 16
    corrupt: false

4. Boot a guest with the image, inspect whether checking the image and getting the information about the image are allowed with the option "-U".
4.1 Test the image on local file system.
1) Boot a guest with image.
(qemu) info block
drive_image1 (#block135): /home/tests/diskfile/base.qcow2 (qcow2)
    Attached to:      /machine/peripheral/image1/virtio-backend
    Cache mode:       writethrough, direct
2) Check the image with option "-U".
# qemu-img check -U base.qcow2 
No errors were found on the image.
22276/327680 = 6.80% allocated, 13.36% fragmented, 0.00% compressed clusters
Image end offset: 1460797440
3) Get the information of the image with option "-U".
# qemu-img info -U base.qcow2 
image: base.qcow2
file format: qcow2
virtual size: 20G (21474836480 bytes)
disk size: 1.4G
cluster_size: 65536
Format specific information:
    compat: 1.1
    lazy refcounts: false
    refcount bits: 16
    corrupt: false

4.2 Test the image on nfs backend.
1) Boot a guest with image.
(qemu) info block
drive_image1 (#block149): /home/tests/nfs/base.qcow2 (qcow2)
    Attached to:      /machine/peripheral/image1/virtio-backend
    Cache mode:       writethrough, direct
2) Check the image with option "-U".
# qemu-img check -U base.qcow2 
No errors were found on the image.
23164/327680 = 7.07% allocated, 19.66% fragmented, 0.00% compressed clusters
Image end offset: 1518993408
3) Get the information of the image with option "-U".
# qemu-img info -U base.qcow2 
image: base.qcow2
file format: qcow2
virtual size: 20G (21474836480 bytes)
disk size: 1.4G
cluster_size: 65536
Format specific information:
    compat: 1.1
    lazy refcounts: false
    refcount bits: 16
    corrupt: false

Comment 32 yisun 2017-11-03 15:04:47 UTC
Hi Fam, 
We use "qemu-img info" to check the image size of some VMs in libvirt test scripts, so we just want to find out whether we need to change our scripts.

=================
Now, when a VM is started, we cannot use qemu-img info to get image info, as follows:

# virsh start avocado-vt-vm1
Domain avocado-vt-vm1 started

# virsh domblklist avocado-vt-vm1
Target     Source
------------------------------------------------
vda        /var/lib/avocado/data/avocado-vt/images/jeos-25-64.qcow2
vdb        /var/lib/libvirt/images/test.qcow2

# qemu-img info /var/lib/libvirt/images/test.qcow2
qemu-img: Could not open '/var/lib/libvirt/images/test.qcow2': Failed to get shared "write" lock
Is another process using the image?

=================
So the problem is that "qemu-img" requires a "write" lock, even though "qemu-img info" is really more of a "read" command.

Anyway, could you help confirm whether it will always behave like this from now on? If so, we'll change the way we get image info while a VM is running, for example by using "virsh domblkinfo" or simply "du" instead of "qemu-img info".

Thanks in advance.

Comment 33 Fam Zheng 2017-11-06 05:17:37 UTC
You can use "qemu-img info -U" to skip the locking, but that means you risk getting inconsistent info when the image is in use.
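
For test scripts like the ones mentioned in comment 32, one possible approach is to try the normal invocation first and fall back to -U only when the failure looks like the locking error. The sketch below is an assumption about how such a helper could look, not part of any existing test suite: the helper names are hypothetical, the lock-failure check is a simple heuristic on the error text quoted in this bug, and values obtained through the -U retry may be inconsistent while the image is in use.

```python
import json
import subprocess


def build_info_cmd(image_path, force_share=False, qemu_img="qemu-img"):
    """Build the argv for `qemu-img info`, optionally with -U."""
    cmd = [qemu_img, "info", "--output=json"]
    if force_share:
        cmd.append("-U")
    cmd.append(image_path)
    return cmd


def is_lock_failure(stderr_text):
    """Heuristic: does stderr look like the image locking error?"""
    return "Failed to get" in stderr_text and "lock" in stderr_text


def image_info(image_path):
    """Query image info, retrying with -U only if the image is locked."""
    result = subprocess.run(build_info_cmd(image_path),
                            capture_output=True, text=True)
    if result.returncode != 0 and is_lock_failure(result.stderr):
        # The image is in use; the retried result may be inconsistent.
        result = subprocess.run(build_info_cmd(image_path, force_share=True),
                                capture_output=True, text=True)
    result.check_returncode()
    return json.loads(result.stdout)
```

Retrying only on a lock failure keeps the consistent, locked read as the default and makes the unsafe path an explicit exception.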

Comment 34 jiyan 2018-01-11 11:33:31 UTC
Hi Fam
I tested another corner-case scenario. Could you help check whether it is a problem?

Version:
# rpm -qa libvirt qemu-kvm-rhev kernel
kernel-3.10.0-826.el7.x86_64
libvirt-3.9.0-7.virtcov.el7.x86_64
qemu-kvm-rhev-2.10.0-15.el7.x86_64

Test scenario:
Scenario-1: Same as the following test scenario
https://bugzilla.redhat.com/show_bug.cgi?id=1378241#c31
3. Boot a snapshot, inspect whether booting backing file is rejected.

# virsh domstate test1;virsh domstate test2
shut off

shut off

# virsh dumpxml test1 |grep "<disk" -A5;virsh dumpxml test2|grep "<disk" -A5
    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2' cache='none'/>
      <source file='/var/lib/libvirt/images/qcow2'/>
      <target dev='hda' bus='scsi'/>
      <address type='drive' controller='0' bus='0' target='0' unit='0'/>
    </disk>

    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2' cache='none'/>
      <source file='/var/lib/libvirt/images/RHEL-7.5-x86_64-latest.qcow2'/>
      <target dev='hda' bus='scsi'/>
      <address type='drive' controller='0' bus='0' target='0' unit='0'/>
    </disk>

# qemu-img info /var/lib/libvirt/images/qcow2 -U
image: /var/lib/libvirt/images/qcow2
file format: qcow2
backing file: /var/lib/libvirt/images/RHEL-7.5-x86_64-latest.qcow2
backing file format: qcow2

# virsh start test1;virsh start test2
Domain test1 started

error: Failed to start domain test2
error: internal error: process exited while connecting to monitor: 
2018-01-11T11:22:51.919166Z qemu-kvm: -drive file=/var/lib/libvirt/images/RHEL-7.5-x86_64-latest.qcow2,format=qcow2,if=none,id=drive-scsi0-0-0-0,cache=none: Failed to get "write" lock
Is another process using the image?

Scenario-2: Change the type of the 'snapshot' disk to 'raw'; both can start successfully
# virsh domstate test1;virsh domstate test2
shut off

shut off

# virsh dumpxml test1 |grep "<disk" -A5;virsh dumpxml test2|grep "<disk" -A5
    <disk type='file' device='disk'>
      <driver name='qemu' type='raw' cache='none'/>
      <source file='/var/lib/libvirt/images/qcow2'/>
      <target dev='hda' bus='scsi'/>
      <address type='drive' controller='0' bus='0' target='0' unit='0'/>
    </disk>
    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2' cache='none'/>
      <source file='/var/lib/libvirt/images/RHEL-7.5-x86_64-latest.qcow2'/>
      <target dev='hda' bus='scsi'/>
      <address type='drive' controller='0' bus='0' target='0' unit='0'/>
    </disk>

# virsh start test1;virsh start test2
Domain test1 started

Domain test2 started

Comment 35 Fam Zheng 2018-01-16 02:09:33 UTC
Forcing a qcow2 image to be interpreted as raw is wrong, so this is a user error, not a bug. When you treat the qcow2 image as raw, it does not link back to the base image; that is why both can boot.
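
A test script could guard against this kind of mismatch by checking the file header before declaring a disk as raw. The sketch below is only an illustration with hypothetical function names; it relies on the qcow header starting with the magic bytes "QFI\xfb" followed by a big-endian 32-bit version number, and it does not validate anything else in the header.

```python
import struct

QCOW_MAGIC = b"QFI\xfb"


def qcow_version(header):
    """Return the qcow version if `header` starts with the qcow magic, else None."""
    if len(header) < 8 or header[:4] != QCOW_MAGIC:
        return None
    (version,) = struct.unpack(">I", header[4:8])
    return version


def file_qcow_version(path):
    """Read the first 8 bytes of `path` and report its qcow version, if any."""
    with open(path, "rb") as f:
        return qcow_version(f.read(8))
```

A non-None result for a disk configured with type='raw' would indicate the misconfiguration shown in Scenario-2.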

Comment 39 errata-xmlrpc 2018-04-11 00:09:32 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:1104

