Bug 985334 - query mem info from monitor would cause qemu-kvm hang [RHEL-6.5]
Summary: query mem info from monitor would cause qemu-kvm hang [RHEL-6.5]
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: qemu-kvm
Version: 6.5
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Laszlo Ersek
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Depends On: 909059
Blocks:
Reported: 2013-07-17 10:09 UTC by Laszlo Ersek
Modified: 2013-12-05 10:03 UTC (History)
CC List: 12 users

Fixed In Version: qemu-kvm-0.12.1.2-2.385.el6
Doc Type: Bug Fix
Doc Text:
Clone Of: 970047
Environment:
Last Closed: 2013-11-21 07:04:39 UTC
Target Upstream Version:
Embargoed:


Links
System: Red Hat Product Errata
ID: RHSA-2013:1553
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: Important: qemu-kvm security, bug fix, and enhancement update
Last Updated: 2013-11-20 21:40:29 UTC

Comment 2 Qunfang Zhang 2013-07-18 06:51:33 UTC
Hi, Laszlo

I cannot reproduce this bug on a rhel6.5 host. According to the original reporter, xwei, he could not reproduce it on a rhel6 host but can reproduce it on a rhel7 host.
Do you mean there is a problem in the rhel6.5 code as well, and that is why you cloned it? Could you give us some suggestions on how to verify it in the future?


Thanks,
Qunfang

Comment 3 Laszlo Ersek 2013-07-18 11:52:00 UTC
Honestly, I don't know. Remember that I could not reproduce this problem on RHEL-7 either.

The bug is caused by a huge number of monitor_flush() calls (incurred, for example, by the "info tlb" HMP command), most of which fail to flush the monitor data to stdout and are instead forced to queue the data. This depends very strongly on scheduling.

Can you maybe try this:

Step I:

(1) on terminal A: 

  mkfifo test.fifo

(2) on terminal B:

  sleep 1000 < test.fifo

(3) on terminal A:

  qemu [usual command line options, including -monitor stdio] \
  | tee test.fifo

  (monitor) cont
  (monitor) info tlb [repeatedly]

The idea is, "tee" will write the monitor output to both the terminal and to the fifo. Now the fifo is open for reading by "sleep", but "sleep" won't actually read data. Hence, once the FIFO is full (4KB), "tee" should block. After further 4KB of data (in total, 8KB) the pipe between "qemu" and "tee" should be full as well ("tee" being blocked), and qemu / monitor_flush() should start running into the situation described above.
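
The blocking behaviour that Step I relies on can be checked without qemu. The following is a minimal sketch, assuming a POSIX shell with mktemp, mkfifo and head available; the temporary directory and the variable names are illustrative and not part of the original steps. It counts how many bytes a writer can push into a FIFO whose reader never reads. The capacity depends on the kernel: on Linux kernels since 2.6.11 the default pipe capacity is 64 KiB, so the figure reported may well be larger than the 4KB mentioned above.

```shell
#!/bin/sh
# Measure how much data a FIFO accepts when its reader never reads.
dir=$(mktemp -d)
fifo="$dir/test.fifo"
mkfifo "$fifo"

# Reader: holds the FIFO open for reading but never consumes anything.
sleep 1000 < "$fifo" &
reader=$!

# Writer: push 1 KiB chunks and record the running total after each chunk.
(
    exec 3> "$fifo"
    total=0
    while :; do
        head -c 1024 /dev/zero >&3
        total=$((total + 1024))
        echo "$total" > "$dir/count"
    done
) &
writer=$!

sleep 2    # give the writer time to fill the FIFO and block
accepted=$(cat "$dir/count")

# Clean up: stopping the reader also releases the blocked write.
kill "$writer" "$reader" 2>/dev/null
wait 2>/dev/null
rm -rf "$dir"
echo "bytes accepted before the writer blocked: $accepted"
```

In the Step I setup, "tee" plays the role of the writer here, and the pipe between qemu and "tee" fills up in the same way once "tee" is blocked.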

Step II:

Unfortunately, Step I in itself is still not enough to reproduce the bug. The above suffices to create some extra watches for stdout-readiness notification, but we need not just "some" of them, but so many that g_poll() fails in the main loop with -1/EINVAL.

You can confirm that by witnessing the same hang as reported for RHEL-7 (actually, it's not a hang; the IO thread is spinning without progress).

If you only manage to do Step I (i.e. the monitor output on the terminal stops, but the guest actually remains responsive via VNC or ssh), then please force qemu-kvm to dump core (*), and hopefully I'll be able to verify the problem by looking at it.

(*) Make sure you have core dumps enabled with ulimit, and send qemu-kvm a SIGABRT with "kill". In theory Ctrl-\ (= Ctrl-BackSlash), i.e. an interactive SIGQUIT, should work too, but maybe qemu catches it; I'm not sure.
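
The footnote above could be carried out roughly as follows. This is only a sketch: a background "sleep" stands in for the qemu-kvm process so that the commands run anywhere, and the variable names are illustrative. In a real session the PID would be that of qemu-kvm, and the ulimit would have to be raised in the shell that starts qemu-kvm, since the limit is inherited at process start.

```shell
#!/bin/sh
# Allow core files of any size for processes started from this shell.
# This can fail if the hard limit is lower, so do not abort on error.
ulimit -c unlimited 2>/dev/null || echo "could not raise the core size limit"

# Stand-in for qemu-kvm (assumption): any long-running process will do.
sleep 1000 &
pid=$!

# SIGABRT terminates the process and, if the limit permits, dumps core.
kill -ABRT "$pid"
wait "$pid"
status=$?

# A process killed by a signal is reported as 128 + signal number.
echo "exit status: $status"
```

Where the core file ends up depends on the system's core_pattern setting, which is not covered here.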

Comment 4 Laszlo Ersek 2013-08-06 12:17:17 UTC
I found a way to reproduce this bug in RHEL-6.

(1) In terminal A, issue the following commands:

    mkfifo fifo.in fifo.out
    /usr/libexec/qemu-kvm -chardev pipe,id=fifo,path=fifo \
        -mon chardev=fifo,default

(2) In terminal B, issue the following command (same directory):

  cat fifo.out

(3) In terminal C, issue the following command (same directory):

  cat >fifo.in

(4) Still in terminal C, type the following command, and verify that its output appears in terminal B:

  info registers

(5) In terminal B, press ^Z (ie. stop (but do not kill) the "cat" process reading from "fifo.out").

(6) In terminal C, repeat the following command indefinitely (it's simplest to keep pasting it from the clipboard):

  info registers

(7) At one point, the qemu-kvm process in terminal A dies, with the following message:

ERROR:/builddir/build/BUILD/qemu-kvm-0.12.1.2/vl.c:3942:glib_select_fill: assertion failed: (n_poll_fds <= ARRAY_SIZE(poll_fds))
Aborted

Comment 5 Laszlo Ersek 2013-08-06 12:36:12 UTC
(In reply to Laszlo Ersek from comment #4)

> (6) In terminal C, repeat the following command indefinitely (it's simplest
> to keep pasting it from the clipboard):
> 
>   info registers
> 
> (7) At one point, the qemu-kvm process in terminal A dies, with the
> following message:
> 
> ERROR:/builddir/build/BUILD/qemu-kvm-0.12.1.2/vl.c:3942:glib_select_fill:
> assertion failed: (n_poll_fds <= ARRAY_SIZE(poll_fds))
> Aborted

In my testing, 128 "info registers" commands issued in step (6) are sufficient to trigger the bug.
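
Instead of pasting from the clipboard, the repeated command in step (6) could be generated by a small loop. The sketch below uses a hypothetical helper, send_info_registers, which is not part of the original reproducer. It is demonstrated against a regular temporary file so that it runs without qemu-kvm.

```shell
#!/bin/sh
# Hypothetical helper: write COUNT copies of "info registers" to TARGET.
send_info_registers() {
    count=$1
    target=$2
    i=0
    while [ "$i" -lt "$count" ]; do
        echo "info registers"
        i=$((i + 1))
    done > "$target"
}

# Demonstration against a regular file instead of the monitor FIFO.
out=$(mktemp)
send_info_registers 128 "$out"
sent=$(($(wc -l < "$out")))
echo "commands written: $sent"
```

For the actual reproducer the target would presumably be fifo.in, used in place of the "cat >fifo.in" session in terminal C. Note that the redirection closes the FIFO when the loop finishes, which the monitor may treat differently from an interactive "cat" that stays open; this has not been verified here.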

Comment 11 Qunfang Zhang 2013-08-09 09:07:25 UTC
Reproduced this bug on qemu-kvm-0.12.1.2-2.382.el6 and verified that the test passes on qemu-kvm-0.12.1.2-2.385.el6.

Steps:

(1) In terminal A, issue the following commands:

    mkfifo fifo.in fifo.out
    /usr/libexec/qemu-kvm -chardev pipe,id=fifo,path=fifo \
        -mon chardev=fifo,default

(2) In terminal B, issue the following command (same directory):

  cat fifo.out

(3) In terminal C, issue the following command (same directory):

  cat >fifo.in

(4) Still in terminal C, type the following command, and verify that its output appears in terminal B:

  info registers

(5) In terminal B, press ^Z (ie. stop (but do not kill) the "cat" process reading from "fifo.out").

(6) In terminal C, repeat the following command indefinitely (it's simplest to keep pasting it from the clipboard):

  info registers

======================

Result:

On the old qemu-kvm-0.12.1.2-2.382.el6, the qemu process in terminal A died at the 90th "info registers" attempt and printed:

[root@t2 home]# /usr/libexec/qemu-kvm -chardev pipe,id=fifo,path=fifo -mon chardev=fifo,default
VNC server running on `::1:5900'

**
ERROR:/builddir/build/BUILD/qemu-kvm-0.12.1.2/vl.c:3942:glib_select_fill: assertion failed: (n_poll_fds <= ARRAY_SIZE(poll_fds))
Aborted (core dumped)
[root@t2 home]# 


On the fixed qemu-kvm-0.12.1.2-2.385.el6, the qemu process does not die after 300 "info registers" inputs.

So, this issue is fixed.
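
To tell whether a given build is at or beyond the fixed version, a rough comparison could look like the sketch below. It assumes GNU "sort -V" is available, and is_fixed is a hypothetical helper name. "sort -V" only approximates RPM's own version ordering, so treat the result as a hint rather than a definitive check.

```shell
#!/bin/sh
# First build listed as fixed in this bug report.
fixed="0.12.1.2-2.385.el6"

# Hypothetical helper: succeeds if the given version-release string
# sorts at or after the fixed build.
is_fixed() {
    lowest=$(printf '%s\n%s\n' "$fixed" "$1" | sort -V | head -n 1)
    [ "$lowest" = "$fixed" ]
}

for v in 0.12.1.2-2.382.el6 0.12.1.2-2.385.el6; do
    if is_fixed "$v"; then
        echo "$v: contains the fix"
    else
        echo "$v: older than the fixed build"
    fi
done
```

On a real host the string to test could be taken from the installed package, for example from the output of "rpm -q qemu-kvm" with the package name prefix removed.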

Comment 13 errata-xmlrpc 2013-11-21 07:04:39 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2013-1553.html

