Bug 727249 - managedsave can crash libvirt
Summary: managedsave can crash libvirt
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: libvirt
Version: 6.2
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: ---
Assignee: Eric Blake
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Depends On: 690175
Blocks:
 
Reported: 2011-08-01 16:34 UTC by Eric Blake
Modified: 2011-12-06 11:20 UTC (History)

Fixed In Version: libvirt-0.9.4-0rc1.2.el6
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2011-12-06 11:20:31 UTC
Target Upstream Version:
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2011:1513 0 normal SHIPPED_LIVE libvirt bug fix and enhancement update 2011-12-06 01:23:30 UTC

Description Eric Blake 2011-08-01 16:34:32 UTC
Description of problem:
See this upstream thread: https://www.redhat.com/archives/libvir-list/2011-July/msg01969.html

Running a loop of 'virsh managedsave dom && virsh start dom' while virt-manager is connected to libvirtd can crash libvirtd, because a sync job (the query commands used by virt-manager) and an async job (the managed save) can both end up trying to use the qemu monitor at the same time.

Version-Release number of selected component (if applicable):
libvirt-0.9.4-0rc2.el6.x86_64

How reproducible:
Pre-patch, I was able to get the loop to crash within 10 iterations; post-patch, it ran 20 iterations without failure.

Steps to Reproduce:
1. for i in `seq 20`; do virsh managedsave dom && virsh start dom || { echo failed on $i; break; }; done
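
The loop in step 1 can also be written as a small shell function, which makes it easier to vary the domain name and the iteration count. This is only a sketch: the function name managedsave_loop and its two arguments are illustrative additions, not part of the original report. It relies only on the same 'virsh managedsave' and 'virsh start' commands used above.

```shell
#!/bin/sh
# Sketch of the step 1 reproducer wrapped in a function.
# The name "managedsave_loop" and its arguments are hypothetical,
# introduced here only for illustration.
#
# Usage: managedsave_loop DOMAIN [ITERATIONS]
# Repeatedly managed-saves and restarts DOMAIN, stopping at the first
# failure. Returns 0 if every iteration succeeded, 1 otherwise.
managedsave_loop() {
    dom=$1
    count=${2:-20}   # default of 20 matches the loop in the report
    i=1
    while [ "$i" -le "$count" ]; do
        # Same logic as: virsh managedsave dom && virsh start dom || fail
        if ! virsh managedsave "$dom" || ! virsh start "$dom"; then
            echo "failed on $i"
            return 1
        fi
        i=$((i + 1))
    done
    echo "completed $count iterations"
}
```

Calling 'managedsave_loop dom 20' while virt-manager is connected corresponds to the original one-liner. A "failed on N" line only says that virsh returned an error; whether libvirtd itself crashed still has to be checked separately.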
  
Actual results:
error: Failed to save domain dom state
error: End of file while reading data: Input/output error
and libvirtd is crashed

Expected results:
no crash

Additional info:
proposed upstream patch:
https://www.redhat.com/archives/libvir-list/2011-July/msg02077.html
That thread mentioned another potential issue with killing libvirtd in the middle of a managed save, but I think it is a distinct issue and will be opening a second bz.

Comment 1 Eric Blake 2011-08-01 16:37:13 UTC
regression introduced in upstream commit 361842881e (after 0.9.3 but prior to 0.9.4-rc1).

Comment 3 Eric Blake 2011-08-01 16:49:53 UTC
see bug 727254 for another, less severe, issue noticed with managedsave while trying to work on the patch for this bug.

Comment 4 Eric Blake 2011-08-01 17:05:00 UTC
In POST:

commit 193cd0f3c879619619a3c35d25311e98693fe2ef
Author: Eric Blake <eblake>
Date:   Thu Jul 28 17:18:24 2011 -0600

    qemu: fix crash when mixing sync and async monitor jobs
    
    Currently, we attempt to run sync job and async job at the same time. It
    means that the monitor commands for two jobs can be run in any order.
    
    In the function qemuDomainObjEnterMonitorInternal():
        if (priv->job.active == QEMU_JOB_NONE && priv->job.asyncJob) {
            if (qemuDomainObjBeginNestedJob(driver, obj) < 0)
    We check whether the caller is an async job by priv->job.active and
    priv->job.asyncJob. But when an async job is running, and a sync job is
    also running at the time of the check, then priv->job.active is not
    QEMU_JOB_NONE. So we cannot check whether the caller is an async job
    in the function qemuDomainObjEnterMonitorInternal(), and must instead
    put the burden on the caller to tell us when an async command wants
    to do a nested job.

Comment 6 dyuan 2011-08-02 07:56:33 UTC
Reproduced this bug with libvirt-0.9.4-0rc2.el6 and verified that the fix passes with libvirt-0.9.4-0rc1.2.el6.

Comment 8 dyuan 2011-08-05 10:31:48 UTC
Moved it to VERIFIED according to comment 6.

Comment 9 errata-xmlrpc 2011-12-06 11:20:31 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2011-1513.html

