Bug 1373600 - virtio-balloon stats virtqueue does not migrate properly
Summary: virtio-balloon stats virtqueue does not migrate properly
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: qemu-kvm-rhev   
Version: 7.3
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: rc
Target Release: 7.4
Assignee: Ladi Prosek
QA Contact: Yumei Huang
URL:
Whiteboard:
Keywords: Regression, ZStream
Depends On:
Blocks: 1401400 1395265 1402509
 
Reported: 2016-09-06 17:04 UTC by Stefan Hajnoczi
Modified: 2017-08-02 03:29 UTC (History)

Fixed In Version: qemu-kvm-rhev-2.8.0-1
Doc Type: Bug Fix
Doc Text:
Prior to this update, migrated guest virtual machines in some cases entered an inconsistent state and terminated unexpectedly after the migration finished due to incorrect handling of the virtqueue. With this update, virtqueue handling on migration is fixed, and no longer causes problems after guest migration.
Story Points: ---
Clone Of:
: 1402509 (view as bug list)
Environment:
Last Closed: 2017-08-01 23:34:44 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---




External Trackers
Tracker: Red Hat Product Errata
ID: RHSA-2017:2392
Priority: normal
Status: SHIPPED_LIVE
Summary: Important: qemu-kvm-rhev security, bug fix, and enhancement update
Last Updated: 2017-08-01 20:04:36 UTC

Description Stefan Hajnoczi 2016-09-06 17:04:54 UTC
Several issues related to the virtio-balloon stats virtqueue have been or are being addressed upstream.

One was already fixed by Ladi Prosek in March:
4eae2a657d1ff5ada56eb9b4966eae0eff333b0b "balloon: fix segfault and harden the stats queue"

He is also currently working on fixing a stats virtqueue hang after migration.

These fixes all need to be backported to RHEL 7.3 and possibly 7.2.z.

Comment 1 Stefan Hajnoczi 2016-09-06 17:06:41 UTC
As discussed with Ladi and Michael Tsirkin on IRC, assigning to Ladi.

Comment 3 Ladi Prosek 2016-09-13 13:07:43 UTC
The fixes have been merged upstream. Adding Romana to see if it's worth trying to get them into 7.3 proper or if we should defer to 7.3.z.

This is a medium impact issue, not a regression. When migrating a VM that has the virtio-balloon QEMU device with stats collection enabled, the stats queue stops working at the destination (i.e. stats collection stops after migration). It was independently found by several members of the community, but we're not aware of any customers hitting this so far.

Upstream commits to cherry-pick:
297a75e virtio: add virtqueue_rewind()
4a1e48b virtio-balloon: fix stats vq migration

This commit mentioned in the description:
4eae2a6 balloon: fix segfault and harden the stats queue
doesn't have to be ported. It is already included in the RHEV-7.3 tree and is not needed in RHEL-7.3 because it doesn't have the problematic commit that introduced the segfault.
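
For readers unfamiliar with the failure mode, here is a toy Python model of a stats queue that stops working after migration and of how a rewind can recover it. It is only a sketch under stated assumptions, not the QEMU code: it assumes the device holds one popped element between polls and that this held element is not carried to the destination. ToyQueue, ToyBalloon and migrate are hypothetical names made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ToyQueue:
    """Hypothetical, heavily simplified device-side view of a virtqueue."""
    avail: list = field(default_factory=list)  # buffers offered by the guest
    used: list = field(default_factory=list)   # buffers completed by the device
    last_avail_idx: int = 0                    # next avail entry to pop

    @property
    def inuse(self):
        # Elements popped by the device but not yet completed.
        return self.last_avail_idx - len(self.used)

    def pop(self):
        if self.last_avail_idx == len(self.avail):
            return None
        elem = self.avail[self.last_avail_idx]
        self.last_avail_idx += 1
        return elem

    def push(self, elem):
        self.used.append(elem)

    def rewind(self, num):
        # Step back over in-flight elements so they can be popped again.
        if num > self.inuse:
            return False
        self.last_avail_idx -= num
        return True


@dataclass
class ToyBalloon:
    vq: ToyQueue
    held: object = None  # element kept between polls (assumed not migrated)

    def receive_stats(self):
        elem = self.vq.pop()
        if elem is not None:
            self.held = elem

    def poll(self):
        """Polling timer fires: complete the held element, if there is one."""
        if self.held is None:
            return False  # nothing to complete, so the guest never refills
        self.vq.push(self.held)
        self.held = None
        return True


def migrate(src, rewind_fix):
    """Copy queue state to a new device; the held element is dropped."""
    dst = ToyBalloon(ToyQueue(list(src.vq.avail), list(src.vq.used),
                              src.vq.last_avail_idx))
    if rewind_fix and dst.held is None and dst.vq.rewind(1):
        dst.receive_stats()  # pop the in-flight element again
    return dst


src = ToyBalloon(ToyQueue())
src.vq.avail.append("stats-buf-0")  # guest offers its stats buffer
src.receive_stats()                 # device pops it and holds it

print("poll works without rewind:", migrate(src, rewind_fix=False).poll())
print("poll works with rewind:   ", migrate(src, rewind_fix=True).poll())
```

In this model the destination without the rewind has nothing to complete when the timer fires, which matches the symptom described in this comment. With the rewind, the in-flight buffer is popped again and polling can continue.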

Comment 11 Yumei Huang 2017-03-13 08:37:54 UTC
Verified with the same steps as https://bugzilla.redhat.com/show_bug.cgi?id=1402509#c7. After migration, stats collection works well on the dst host.

Details:
qemu-kvm-rhev-2.8.0-5.el7
kernel-3.10.0-558.el7.x86_64

QEMU cmdline:
# /usr/libexec/qemu-kvm -m 4G rhel73-64-virtio.qcow2  -netdev tap,id=hostnet1 -device virtio-net-pci,mac=42:ce:a9:d2:4d:d9,id=idlbq7eA,netdev=hostnet1 -vnc :2  -monitor stdio  -no-user-config -nodefaults  -usb -device usb-tablet,id=input0 -vga qxl    -qmp tcp:0:4444,server,nowait -device virtio-balloon-pci,id=balloon0,guest-stats-polling-interval=2

After migration, 
{ "execute":"qom-get", "arguments":{"path":'/machine/peripheral/balloon0', "property": "guest-stats" } }
{"return": {"stats": {"stat-swap-out": 0, "stat-available-memory": 3261628416, "stat-free-memory": 3145887744, "stat-minor-faults": 1082832, "stat-major-faults": 1213, "stat-total-memory": 3975217152, "stat-swap-in": 0}, "last-update": 1489393955}}
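
A check like the one above could be scripted. The following Python sketch parses a guest-stats reply of the shape shown and reports whether the stats look fresh. The helper name summarize_guest_stats and the max_age_s threshold are hypothetical, and it assumes the memory values are byte counts and that last-update is a Unix timestamp in seconds.

```python
import json

# Reply copied from the qom-get output above.
REPLY = (
    '{"return": {"stats": {"stat-swap-out": 0, '
    '"stat-available-memory": 3261628416, "stat-free-memory": 3145887744, '
    '"stat-minor-faults": 1082832, "stat-major-faults": 1213, '
    '"stat-total-memory": 3975217152, "stat-swap-in": 0}, '
    '"last-update": 1489393955}}'
)

MIB = 1024 * 1024


def summarize_guest_stats(reply_text, now, max_age_s):
    """Hypothetical helper: summarize a guest-stats reply.

    now       -- current Unix time in seconds
    max_age_s -- how old last-update may be before stats count as stale
                 (an assumed threshold, e.g. a few polling intervals)
    """
    ret = json.loads(reply_text)["return"]
    stats = ret["stats"]
    age = now - ret["last-update"]
    return {
        "fresh": 0 <= age <= max_age_s,
        "age_s": age,
        "free_mib": stats["stat-free-memory"] // MIB,    # assumes bytes
        "total_mib": stats["stat-total-memory"] // MIB,  # assumes bytes
    }


# Pretend the query ran 2 seconds after the last update, which matches
# guest-stats-polling-interval=2 from the command line above.
print(summarize_guest_stats(REPLY, now=1489393957, max_age_s=6))
```

If last-update stops advancing on the destination while the polling interval is still set, that points to the hang described in this bug.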

Comment 12 Yumei Huang 2017-03-13 08:47:38 UTC
Moving to verified per comment 11.

Comment 14 errata-xmlrpc 2017-08-01 23:34:44 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2017:2392

