Red Hat Bugzilla – Bug 1373600
virtio-balloon stats virtqueue does not migrate properly
Last modified: 2017-08-01 23:29:59 EDT
Several issues related to the virtio-balloon stats virtqueue have been or are being addressed upstream. One was already fixed by Ladi Prosek in March: 4eae2a657d1ff5ada56eb9b4966eae0eff333b0b "balloon: fix segfault and harden the stats queue". He is also currently working on a fix for a stats virtqueue hang after migration. These fixes all need to be backported to RHEL 7.3 and possibly 7.2.z.
As discussed with Ladi and Michael Tsirkin on IRC, assigning to Ladi.
The fixes have been merged upstream. Adding Romana to see if it's worth trying to get them into 7.3 proper or if we should defer to 7.3.z.

This is a medium impact issue, not a regression. When migrating a VM that has the virtio-balloon QEMU device with stats collection enabled, the stats queue stops working at the destination (i.e. stats collection stops after migration). It was independently found by several members of the community, but we're not aware of any customers hitting this so far.

Upstream commits to cherry-pick:
297a75e virtio: add virtqueue_rewind()
4a1e48b virtio-balloon: fix stats vq migration

This commit mentioned in the description:
4eae2a6 balloon: fix segfault and harden the stats queue
doesn't have to be ported. It is already included in the RHEV-7.3 tree and is not needed in RHEL-7.3 because RHEL-7.3 doesn't have the problematic commit that introduced the segfault.
Verified with the same steps as https://bugzilla.redhat.com/show_bug.cgi?id=1402509#c7: after migration, stats collection works well on the destination host.

Details:
qemu-kvm-rhev-2.8.0-5.el7
kernel-3.10.0-558.el7.x86_64

QEMU cmdline:
# /usr/libexec/qemu-kvm -m 4G rhel73-64-virtio.qcow2 -netdev tap,id=hostnet1 -device virtio-net-pci,mac=42:ce:a9:d2:4d:d9,id=idlbq7eA,netdev=hostnet1 -vnc :2 -monitor stdio -no-user-config -nodefaults -usb -device usb-tablet,id=input0 -vga qxl -qmp tcp:0:4444,server,nowait -device virtio-balloon-pci,id=balloon0,guest-stats-polling-interval=2

After migration:
{ "execute":"qom-get", "arguments":{"path":'/machine/peripheral/balloon0', "property": "guest-stats" } }
{"return": {"stats": {"stat-swap-out": 0, "stat-available-memory": 3261628416, "stat-free-memory": 3145887744, "stat-minor-faults": 1082832, "stat-major-faults": 1213, "stat-total-memory": 3975217152, "stat-swap-in": 0}, "last-update": 1489393955}}
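For anyone who wants to script the check above rather than type the QMP command by hand, here is a minimal sketch in Python. It is not part of the original verification. It assumes the QMP socket from the command line above (tcp port 4444) and the balloon0 device id. The helper names (build_guest_stats_query, stats_advanced, check_stats_still_updating) and the idea of comparing "last-update" across two samples are illustrative assumptions, not an official test procedure.

```python
import json
import socket
import time


def build_guest_stats_query(device_path="/machine/peripheral/balloon0"):
    """Build the same qom-get command that was issued manually above."""
    return {
        "execute": "qom-get",
        "arguments": {"path": device_path, "property": "guest-stats"},
    }


def stats_advanced(before, after):
    """Return True if 'after' carries stats with a newer last-update than 'before'.

    Assumption: a growing "last-update" value means the stats virtqueue is
    still being serviced on this host.
    """
    old = before.get("return", {})
    new = after.get("return", {})
    return bool(new.get("stats")) and new.get("last-update", 0) > old.get("last-update", 0)


def _read_reply(stream):
    """Read QMP messages line by line, skipping the greeting and async events."""
    while True:
        line = stream.readline()
        if not line:
            raise ConnectionError("QMP connection closed")
        message = json.loads(line)
        if "return" in message or "error" in message:
            return message


def _send(stream, command):
    """Send one QMP command and return its reply."""
    stream.write(json.dumps(command) + "\n")
    stream.flush()
    return _read_reply(stream)


def check_stats_still_updating(host="127.0.0.1", port=4444, wait_seconds=5):
    """Sample guest-stats twice and report whether last-update moved forward."""
    with socket.create_connection((host, port), timeout=10) as sock:
        stream = sock.makefile("rw", encoding="utf-8")
        # QMP requires capability negotiation before any other command.
        _send(stream, {"execute": "qmp_capabilities"})
        first = _send(stream, build_guest_stats_query())
        time.sleep(wait_seconds)  # should exceed guest-stats-polling-interval
        second = _send(stream, build_guest_stats_query())
    return stats_advanced(first, second)
```

Run against the destination QEMU after migration, check_stats_still_updating() would be expected to return True with the fixed build and False when the stats queue has stalled. wait_seconds should be longer than the guest-stats-polling-interval given on the command line (2 seconds here).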
Moving to verified per comment 11.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2017:2392