Description of problem:

In upstream, it was reported that virtiofsd stops responding when pausing and resuming the VM:
- https://gitlab.com/virtio-fs/virtiofsd/-/issues/110

This is a regression; it works in the C version of virtiofsd.

How reproducible:
100%

Steps to Reproduce:
1. Create a VM with a virtiofs device
2. Boot the VM and mount virtiofs
3. virsh suspend vm
4. virsh resume vm

Actual results:
virtiofsd stops responding.

Expected results:
virtiofsd should continue working.

Additional info:
This is not a bug in virtiofsd itself; the error is in one of our dependencies, the vhost-user-backend crate (rust-vmm). When the VM is stopped, GET_VRING_BASE is issued, and when it is resumed, SET_VRING_BASE sets the retrieved value. Because GET_VRING_BASE resets the state of the VQ, the device fails to resume operation.

This is already fixed upstream:
- https://github.com/rust-vmm/vhost/pull/154
- https://github.com/rust-vmm/vhost/pull/161
- https://gitlab.com/virtio-fs/virtiofsd/-/merge_requests/175
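For illustration, here is a minimal, self-contained Rust sketch of that handshake. All names in it (VringState, the method names) are hypothetical, not the actual vhost-user-backend API; it only models the avail-index bookkeeping that the upstream fix preserves:

// Minimal sketch of the vring-base handshake during pause/resume.
// VringState and the method names are hypothetical, NOT the real
// vhost-user-backend API; only the avail-index bookkeeping is modeled.

#[derive(Debug, Default)]
struct VringState {
    // Index of the next available descriptor the backend will process.
    next_avail: u16,
}

impl VringState {
    // Buggy behavior: GET_VRING_BASE resets the ring state, so the
    // index reported back to the frontend is always 0.
    fn get_vring_base_buggy(&mut self) -> u16 {
        *self = VringState::default(); // progress is lost here
        self.next_avail
    }

    // Fixed behavior: report the current avail index and leave the
    // ring state untouched.
    fn get_vring_base_fixed(&self) -> u16 {
        self.next_avail
    }

    // SET_VRING_BASE on resume restores the index the frontend saved.
    fn set_vring_base(&mut self, base: u16) {
        self.next_avail = base;
    }
}

fn main() {
    // Both backends have processed 20 requests when the VM is paused.
    let mut buggy = VringState { next_avail: 20 };
    let mut fixed = VringState { next_avail: 20 };

    // Pause: the frontend saves whatever base each backend reports.
    let saved_buggy = buggy.get_vring_base_buggy(); // 0: progress lost
    let saved_fixed = fixed.get_vring_base_fixed(); // 20: progress kept

    // Resume: the saved base is written back via SET_VRING_BASE.
    buggy.set_vring_base(saved_buggy);
    fixed.set_vring_base(saved_fixed);

    // The fixed backend continues at entry 20; the buggy one restarts
    // at 0 and re-walks the ring, which is the hang reported above.
    assert_eq!(fixed.next_avail, 20);
    assert_eq!(buggy.next_avail, 0);
}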
The Windows driver also hits this issue.
I didn't reproduce this issue on the following env.

Guest: rhel9.3 5.14.0-333.el9.x86_64
Host: rhel9.3 5.14.0-324.el9.x86_64
qemu-kvm-8.0.0-5.el9.x86_64
virtio-win-prewhql-0.1-239
kernel-5.14.0-324.el9.x86_64
edk2-ovmf-20230301gitf80f052277c8-5.el9.noarch

Steps like comment 0:
1. Create a VM with a virtiofs device
# /usr/libexec/virtiofsd --shared-dir /home/test --socket-path /tmp/sock1 --log-level debug
2. Boot the VM and mount virtiofs
-chardev socket,id=char_virtiofs_fs,path=/tmp/sock1 \
-device vhost-user-fs-pci,id=vufs_virtiofs_fs,chardev=char_virtiofs_fs,tag=myfs,bus=pcie-root-port-3,addr=0x0 \
3. Stop and continue the VM
(qemu) stop
(qemu) cont
4. Check virtiofs in the guest.

Results: it works well; I can read/write in virtiofs inside the guest.
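For completeness, the guest-side check in step 4 can be something like the following; the tag myfs comes from the -device line above, and the mount point /mnt is just an example:

# mount the virtiofs tag and exercise the share
mount -t virtiofs myfs /mnt
echo probe > /mnt/probe.txt && cat /mnt/probe.txt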
Hi German, could you help check the steps above if you're available? Thanks.
(In reply to xiagao from comment #3)
> Hi German, could you help check the steps above if you're available?
> Thanks.

The steps are OK, but if you check the debug output of virtiofsd, you will see that virtiofsd repeats the first operation in the VQ until it "catches up" with the entry in the guest. In my tests, the unique value is normally incremented with each operation, but here it keeps using the number 20:

[2023-07-17T14:49:21Z DEBUG virtiofsd] QUEUE_EVENT
[2023-07-17T14:49:21Z DEBUG virtiofsd::server] Received request: opcode=Getattr (3), inode=1, unique=20, pid=847
[2023-07-17T14:49:21Z DEBUG virtiofsd::server] Replying OK, header: OutHeader { len: 120, error: 0, unique: 20 }
[2023-07-17T14:49:21Z DEBUG virtiofsd::server] Received request: opcode=Getattr (3), inode=1, unique=20, pid=847
[2023-07-17T14:49:21Z DEBUG virtiofsd::server] Replying OK, header: OutHeader { len: 120, error: 0, unique: 20 }
[2023-07-17T14:49:21Z DEBUG virtiofsd::server] Received request: opcode=Getattr (3), inode=1, unique=20, pid=847
...
[2023-07-17T14:49:21Z DEBUG virtiofsd::server] Replying OK, header: OutHeader { len: 120, error: 0, unique: 20 }
[2023-07-17T14:49:21Z DEBUG virtiofsd::server] Received request: opcode=Getattr (3), inode=1, unique=20, pid=847
[2023-07-17T14:49:21Z DEBUG virtiofsd::server] Replying OK, header: OutHeader { len: 120, error: 0, unique: 20 }

To make it fail, you should set a small queue-size in qemu, like:
-device vhost-user-fs-pci,queue-size=16,... \
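To make the replay concrete, here is a small illustrative sketch of the index arithmetic (a hypothetical model under the same assumptions as the sketch in comment 0, not virtiofsd code) showing why the backend re-walks the ring after resume and why a small queue-size turns the replay into a hang:

// Illustrative arithmetic for the replay seen in the log above.
// Hypothetical model, not virtiofsd code.

fn pending(avail_idx: u16, next_avail: u16) -> u16 {
    // How many entries the backend believes are pending: the guest's
    // free-running avail index minus the backend's own position.
    avail_idx.wrapping_sub(next_avail)
}

fn main() {
    let guest_avail_idx: u16 = 20; // guest has published 20 requests
    let queue_size: u16 = 16;      // -device ...,queue-size=16

    // Correct resume: the backend position was preserved, nothing pending.
    assert_eq!(pending(guest_avail_idx, 20), 0);

    // Buggy resume: the backend position was reset to 0, so it believes
    // 20 entries are pending and re-walks stale ring slots -- the
    // repeated unique=20 requests in the log.
    let stale = pending(guest_avail_idx, 0);
    assert_eq!(stale, 20);

    // With a large queue the replay eventually catches up; with
    // queue-size=16 the backend sees more pending entries than the
    // ring can hold, an invalid state, and processing stalls.
    assert!(stale > queue_size);
}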
(In reply to German Maglione from comment #4)
> To make it fail, you should set a small queue-size in qemu, like:
> -device vhost-user-fs-pci,queue-size=16,... \

Thank you. With queue-size=16, I can reproduce the problem.
Pre-verifying this bz, as it works with virtiofsd-1.7.
It works with virtiofsd-1.7, so marking this bug as verified.