Bug 1741620 - qemu-nbd and qemu-img convert can write to the same image
Summary: qemu-nbd and qemu-img convert can write to the same image
Keywords:
Status: NEW
Alias: None
Product: Fedora
Classification: Fedora
Component: qemu
Version: 32
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Fedora Virtualization Maintainers
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2019-08-15 15:31 UTC by Nir Soffer
Modified: 2020-02-11 17:58 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:
Type: Bug


Description Nir Soffer 2019-08-15 15:31:11 UTC
Description of problem:

When qemu-nbd is serving a qcow2 image, it takes a write lock, and qemu-img info
fails to access the image unless the -U option is given. However, when serving
a raw image, qemu-img info does not require any lock. We can even write to the
image used by qemu-nbd with qemu-img convert.

Here is how to reproduce:

$ qemu-img create -f raw test.raw 1g
$ qemu-img create -f qcow2 test.qcow2 1g

$ qemu-nbd --socket /tmp/nbd.sock --format qcow2 --persistent --export-name= \
    --cache=none --aio=native test.qcow2

From another shell:

$ qemu-img info test.qcow2
qemu-img: Could not open 'test.qcow2': Failed to get shared "write" lock
Is another process using the image [test.qcow2]?

(expected)

$ qemu-img info -U test.qcow2
image: test.qcow2
file format: qcow2
virtual size: 1 GiB (1073741824 bytes)
disk size: 196 KiB
cluster_size: 65536
Format specific information:
    compat: 1.1
    lazy refcounts: false
    refcount bits: 16
    corrupt: false

(expected)

$ qemu-nbd --socket /tmp/nbd.sock --format raw --persistent --export-name= \
    --cache=none --aio=native test.raw

$ qemu-img info test.raw
image: test.raw
file format: raw
virtual size: 1 GiB (1073741824 bytes)
disk size: 0 B

No locking needed...

Let's try to write to test.raw behind qemu-nbd's back:

$ qemu-img convert -n -p -f qcow2 -O raw -t none -T none test.qcow2 test.raw
    (100.00/100%)

Looks like qemu-nbd does not lock raw images.


Version-Release number of selected component (if applicable):
qemu-img-4.1.0-0.1.rc2.fc29.x86_64 (from virt-preview repo)

How reproducible:
Always

This can lead to data corruption if a user tries to modify a disk image while
it is being served by qemu-nbd in read-write mode.

I think we have the same issue in CentOS, since we see the same behavior
in oVirt on CentOS/RHEL.

Comment 1 Nir Soffer 2019-08-15 15:44:31 UTC
I forgot to mention that this was reproduced on local xfs file system
and on gluster mount.

Comment 2 Richard W.M. Jones 2019-08-15 16:15:20 UTC
I'm a bit confused here.  What are you expecting to happen that a lock would prevent?
qemu-img info is only showing you the size of the file.

Comment 3 Nir Soffer 2019-08-15 16:22:57 UTC
True, qemu-img info on a raw file is pretty safe, but the same is true for a
qcow2 image, since it only looks at the header.

But allowing qemu-img convert to write to the image served by qemu-nbd means
that 2 users can write to the same image at the same time.

Comment 4 Nir Soffer 2019-08-15 16:33:36 UTC
Testing with qemu shows that qemu-nbd and qemu-img convert fail to access an image used by qemu:

$ qemu-kvm -drive file=/var/tmp/test.raw,format=raw
...

$ qemu-nbd --socket /tmp/nbd.sock --format raw --persistent --export-name= \
    --cache=none --aio=native /var/tmp/test.raw
qemu-nbd: Failed to blk_new_open '/var/tmp/test.raw': Failed to get "write" lock
Is another process using the image [/var/tmp/test.raw]?

$ qemu-img convert -f qcow2 -O raw -t none -T none /var/tmp/test.qcow2 /var/tmp/test.raw 
qemu-img: /var/tmp/test.raw: error while converting raw: Failed to get "write" lock
Is another process using the image [/var/tmp/test.raw]?

$ qemu-img convert -n -f qcow2 -O raw -t none -T none /var/tmp/test.qcow2 /var/tmp/test.raw 
qemu-img: Could not open '/var/tmp/test.raw': Failed to get "write" lock
Is another process using the image [/var/tmp/test.raw]?

So maybe the issue is that qemu-nbd and qemu-img convert take a shared lock instead
of an exclusive lock?
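
To make the shared-versus-exclusive distinction concrete, here is a minimal
Python sketch using flock() advisory locks. This is only an analogy and an
assumption made for illustration: it is not how QEMU implements image locking,
and the helper names try_lock and demo are hypothetical. It opens one temporary
file twice and records which lock combinations are granted.

```python
import fcntl
import os
import tempfile


def try_lock(fd, mode):
    """Try to take an advisory flock() lock without blocking; True on success."""
    try:
        fcntl.flock(fd, mode | fcntl.LOCK_NB)
        return True
    except BlockingIOError:
        return False


def demo():
    """Open one file twice (two independent open file descriptions) and
    report which lock combinations are granted."""
    results = {}
    with tempfile.NamedTemporaryFile() as tmp:
        a = os.open(tmp.name, os.O_RDWR)
        b = os.open(tmp.name, os.O_RDWR)
        try:
            # Two shared locks coexist: neither holder excludes the other.
            results["shared+shared"] = (
                try_lock(a, fcntl.LOCK_SH) and try_lock(b, fcntl.LOCK_SH)
            )
            fcntl.flock(b, fcntl.LOCK_UN)
            # An exclusive request is refused while a shared lock is held.
            results["shared+exclusive"] = try_lock(b, fcntl.LOCK_EX)
            fcntl.flock(a, fcntl.LOCK_UN)
            # With no other holder, the exclusive lock is granted...
            results["exclusive alone"] = try_lock(a, fcntl.LOCK_EX)
            # ...and a second exclusive request is refused.
            results["exclusive+exclusive"] = try_lock(b, fcntl.LOCK_EX)
        finally:
            os.close(a)
            os.close(b)
    return results


if __name__ == "__main__":
    for case, granted in demo().items():
        print(f"{case}: {'granted' if granted else 'refused'}")
```

If both writers only ever ask for the shared kind of lock, both requests are
granted, which is consistent with what the two qemu-img convert processes and
qemu-nbd appear to do on a raw image.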

Comment 5 Nir Soffer 2019-08-15 17:36:20 UTC
Here is another example of 2 qemu-img convert processes writing happily to the same file:

$ for i in 1 2; do (qemu-img convert -n -f raw -O raw -t none -T none src.raw dst.raw &); done

$ ps auxf
...
nsoffer  28419 11.2  0.0 780692 22424 pts/4    Sl   20:30   0:00  \_ qemu-img convert -n -f raw -O raw -t none -T none src.raw dst.raw
nsoffer  28421 11.2  0.0 706960 22196 pts/4    Sl   20:30   0:00  \_ qemu-img convert -n -f raw -O raw -t none -T none src.raw dst.raw

Comment 6 Nir Soffer 2019-08-15 17:42:33 UTC
Another example, showing how qemu-nbd and qemu-img convert write to the same image:

1. Start qemu-nbd

$ qemu-nbd --socket /tmp/nbd.sock --format raw --persistent --export-name= --cache=none --aio=native dst.raw

2. In another shell, start qemu-img convert to qemu-nbd socket:

$ qemu-img convert -p -n -f raw -O raw -t none -T none src.raw nbd:unix:/tmp/nbd.sock
    (100.00/100%)

3. In another shell, start qemu-img convert to the same image:

$ qemu-img convert -p -n -f raw -O raw -t none -T none src.raw dst.raw
    (100.00/100%)

Comment 7 Kevin Wolf 2019-08-16 15:09:31 UTC
This could probably only be fixed if we could transfer the permissions/file locks over the NBD protocol.

If I understand correctly, the NBD server currently doesn't request exclusive write access to the image because the client might actually be okay with sharing the image and we don't want to make such use cases impossible. However, the client doesn't have a way to communicate that this isn't the case and an exclusive lock should be taken on the server.

Looks like something Eric might want to have a look at.
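
The behavior described above can be modeled as a toy permission check in which
each user of an image declares what it needs and what it tolerates from others.
This is a sketch under stated assumptions, not QEMU code: the Perm bits, the
User type, the conflict function and the settings assigned to each tool are
hypothetical, chosen only to match the results reported in comments 4 and 6.

```python
from dataclasses import dataclass
from enum import Flag


class Perm(Flag):
    """Illustrative permission bits; the values are arbitrary, not QEMU's."""
    NONE = 0
    READ = 1
    WRITE = 2
    ALL = 3  # READ | WRITE


@dataclass(frozen=True)
class User:
    name: str
    needs: Perm   # what this user wants to do with the image
    shares: Perm  # what this user tolerates other users doing


def conflict(a: User, b: User) -> bool:
    """Two users conflict when one needs a permission the other does not share."""
    return bool(a.needs & ~b.shares) or bool(b.needs & ~a.shares)


# Assumed settings, chosen to mirror the observations in this report.
guest = User("qemu guest disk", Perm.READ | Perm.WRITE, Perm.READ)
nbd = User("qemu-nbd (raw, writable)", Perm.READ | Perm.WRITE, Perm.ALL)
convert = User("qemu-img convert target", Perm.READ | Perm.WRITE, Perm.ALL)

if __name__ == "__main__":
    print(conflict(guest, nbd))      # guest does not share WRITE
    print(conflict(guest, convert))  # same reason
    print(conflict(nbd, convert))    # both share everything
```

In this model the guest blocks the other writers because it does not share
WRITE, while qemu-nbd and qemu-img convert never block each other because both
share everything. Making the NBD server exclusive would amount to dropping
WRITE from its shares value, which is the step that would break clients that
are fine with sharing.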

Comment 8 Nir Soffer 2019-08-16 22:57:09 UTC
Kevin, see also comment 5: qemu-img convert cannot write to an image used by qemu,
but it can write to an image used by another qemu-img convert.

For NBD, qemu-nbd has a --read-only mode. If an image is opened without
--read-only, qemu-nbd can take an exclusive lock.

The exclusive lock does not prevent multiple clients from accessing qemu-nbd,
for example using 2 connections to transfer the same image faster.

I think any qemu tool that opens an image for writing should use an exclusive lock.

Comment 9 Richard W.M. Jones 2019-08-17 06:52:08 UTC
Just please provide a way to opt out.

Comment 10 Eric Blake 2019-08-17 13:22:46 UTC
(In reply to Kevin Wolf from comment #7)
> This could probably only be fixed if we could transfer the permissions/file
> locks over the NBD protocol.

Supporting locks over the protocol is unlikely to happen (getting that correct is rather hairy).  But there are lighter-weight solutions.  The NBD protocol already has the notion of NBD_FLAG_CAN_MULTI_CONN, which is set when the server promises that multiple writers can safely share the same image.  qemu-nbd does NOT set this flag on writable images (and in fact, the patch to make it even report the flag on read-only images was just barely posted).  But what it does not have is a way to advertise whether there are other concurrent writers, or for a client to request on connect that it be an exclusive writer (and fail the connection if the server cannot honor that).  I've had ideas about how to add that to the protocol in the past; I'll look at reviving those ideas.

> If I understand correctly, the NBD server currently doesn't request
> exclusive write access to the image because the client might actually be
> okay with sharing the image and we don't want to make such use cases
> impossible. However, the client doesn't have a way to communicate that this
> isn't the case and an exclusive lock should be taken on the server.

Correct, it would have to be a protocol extension - both a way for the server to advertise if there are shared writers (so a client aware of that advertisement can disconnect when it finds it is not the lone writer), and a way for the client to request exclusive write access (if the server doesn't understand the request, the client disconnects; if the server does understand it, the client is ensured no other client will simultaneously access the image).  The idea definitely has merit.
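
The two halves of that idea (the server advertising shared writers, and a
client requesting exclusive write access) can be sketched as a toy state
machine in Python. Everything here is hypothetical: no such NBD option exists
in this report, and the class and method names are invented for illustration.
The sketch tracks writers only and ignores read-only clients.

```python
class ExportState:
    """Toy model of one writable export; not real NBD protocol code."""

    def __init__(self):
        self.writers = 0             # currently connected writers
        self.exclusive_held = False  # True while one writer was promised exclusivity

    def has_shared_writers(self):
        """What the server would advertise: more than one writer is connected."""
        return self.writers > 1

    def connect_writer(self, exclusive=False):
        """Return True if the connection is accepted, False if it is refused."""
        if self.exclusive_held:
            return False  # another client was promised exclusive access
        if exclusive and self.writers > 0:
            return False  # exclusivity cannot be honored, so fail the connection
        self.writers += 1
        self.exclusive_held = exclusive
        return True

    def disconnect_writer(self):
        """Drop one writer; an exclusive promise ends with its only writer."""
        if self.writers > 0:
            self.writers -= 1
        self.exclusive_held = False


if __name__ == "__main__":
    export = ExportState()
    print(export.connect_writer(exclusive=True))  # first writer, exclusive
    print(export.connect_writer())                # refused while exclusivity is held
```

A client that does not ask for exclusivity keeps today's behavior, so existing
users that are fine with sharing an image are unaffected.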

Comment 11 Ben Cotton 2020-02-11 17:58:33 UTC
This bug appears to have been reported against 'rawhide' during the Fedora 32 development cycle.
Changing version to 32.

