Bug 1446455 - libvirtd crashes on domain shutdown after reconnecting to a domain with a mediated host device attached
Summary: libvirtd crashes on domain shutdown after reconnecting to a domain with a mediated host device attached
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: libvirt
Version: 7.3
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Erik Skultety
QA Contact: zhe peng
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2017-04-28 05:44 UTC by Erik Skultety
Modified: 2017-08-02 01:32 UTC (History)

Fixed In Version: libvirt-3.2.0-5.el7
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-08-02 00:08:25 UTC
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHEA-2017:1846 0 normal SHIPPED_LIVE libvirt bug fix and enhancement update 2017-08-01 18:02:50 UTC

Description Erik Skultety 2017-04-28 05:44:19 UTC
Description of problem:
With a running domain that has a mediated host device attached, restarting libvirtd (which reconnects to the VM's qemu process) and then shutting down the domain crashes the daemon.

Version-Release number of selected component (if applicable):
libvirt-3.2.0

How reproducible:
always

Steps to Reproduce:
1. Prepare a domain with a mediated host device


2. start the VM
3. restart libvirtd (which will then reconnect to the qemu process)
4. shutdown the VM
5. daemon crashes
 virsh # error: Disconnected from qemu:///system due to keepalive timeout

Actual results:
libvirtd crashes

Expected results:
libvirtd doesn't crash

Additional info:
The daemon crashes because 'virHostdevReAttachMediatedDevices' accesses a pointer that was already freed during the reconnect phase; refer to the traceback.

#0  0x00007ffff3ccdeba in __strcmp_sse2_unaligned () from /lib64/libc.so.6
#1  0x00007ffff72a444a in virMediatedDeviceListFindIndex (list=0x7fff9c076590, 
    dev=0x7fff9c050980) at util/virmdev.c:388
#2  0x00007ffff72a4491 in virMediatedDeviceListFind (list=0x7fff9c076590, 
    dev=0x7fff9c050980) at util/virmdev.c:401
#3  0x00007ffff7241446 in virHostdevReAttachMediatedDevices (mgr=0x7fff9c07d0f0, 
    drv_name=0x7fffc60e6dd2 "QEMU", dom_name=0x7fff9c0e6c60 "f24-test", 
    hostdevs=0x7fff9c196420, nhostdevs=2) at util/virhostdev.c:2055
#4  0x00007fffc60215d9 in qemuHostdevReAttachMediatedDevices (driver=0x7fff9c08d420, 
    name=0x7fff9c0e6c60 "f24-test", hostdevs=0x7fff9c196420, nhostdevs=2)
    at qemu/qemu_hostdev.c:457
#5  0x00007fffc60216dc in qemuHostdevReAttachDomainDevices (driver=0x7fff9c08d420, 
    def=0x7fff9c04bbf0) at qemu/qemu_hostdev.c:480
#6  0x00007fffc6046e6f in qemuProcessStop (driver=0x7fff9c08d420, vm=0x7fff9c197d80, 
    reason=VIR_DOMAIN_SHUTOFF_SHUTDOWN, asyncJob=QEMU_ASYNC_JOB_NONE, flags=0)
    at qemu/qemu_process.c:6306
#7  0x00007fffc6091596 in processMonitorEOFEvent (driver=0x7fff9c08d420, 
    vm=0x7fff9c197d80) at qemu/qemu_driver.c:4566
#8  0x00007fffc6091793 in qemuProcessEventHandler (data=0x555555875870, 
    opaque=0x7fff9c08d420) at qemu/qemu_driver.c:4611
#9  0x00007ffff7294bf5 in virThreadPoolWorker (opaque=0x5555558697a0)
    at util/virthreadpool.c:167
#10 0x00007ffff7294184 in virThreadHelper (data=0x555555869340)
    at util/virthread.c:206
#11 0x00007ffff3fdc3c4 in start_thread () from /lib64/libpthread.so.0
#12 0x00007ffff3d269cf in clone () from /lib64/libc.so.6

Comment 1 Erik Skultety 2017-04-28 05:48:31 UTC
> Steps to Reproduce:
> 1. Prepare a domain with a mediated host device

And here I meant to post the relevant domain XML snippet... 

...
<hostdev mode='subsystem' type='mdev' managed='no' model='vfio-pci'>
  <source>
    <address uuid='e1aae5c9-85f8-437b-8a93-8615780b14ca'/>
  </source>
</hostdev>
...

Comment 2 Erik Skultety 2017-05-04 06:12:19 UTC
fixed upstream by:

commit 92e30a4dace54d06433f763e1acba0a81bb5c82e
Refs: v3.3.0-rc2-9-g92e30a4da
Author:     Erik Skultety <eskultet>
AuthorDate: Fri Apr 28 09:24:31 2017 +0200
Commit:     Erik Skultety <eskultet>
CommitDate: Thu May 4 08:05:03 2017 +0200

    mdev: Fix daemon crash on domain shutdown after reconnect

    The problem resides in virHostdevUpdateActiveMediatedDevices which gets
    called during qemuProcessReconnect. The issue here is that
    virMediatedDeviceListAdd takes a pointer to the item to be added to the
    list to which VIR_APPEND_ELEMENT is used, which also clears the pointer.
    However, in this case only the local copy of the pointer got cleared,
    leaving the original pointing to valid memory. To sum it up, during
    cleanup phase, the original pointer is freed and the daemon crashes
    basically any time it would access it.

    Backtrace:
    0x00007ffff3ccdeba in __strcmp_sse2_unaligned
    0x00007ffff72a444a in virMediatedDeviceListFindIndex
    0x00007ffff7241446 in virHostdevReAttachMediatedDevices
    0x00007fffc60215d9 in qemuHostdevReAttachMediatedDevices
    0x00007fffc60216dc in qemuHostdevReAttachDomainDevices
    0x00007fffc6046e6f in qemuProcessStop
    0x00007fffc6091596 in processMonitorEOFEvent
    0x00007fffc6091793 in qemuProcessEventHandler
    0x00007ffff7294bf5 in virThreadPoolWorker
    0x00007ffff7294184 in virThreadHelper
    0x00007ffff3fdc3c4 in start_thread () from /lib64/libpthread.so.0
    0x00007ffff3d269cf in clone () from /lib64/libc.so.6

    Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1446455

    Signed-off-by: Erik Skultety <eskultet>
    Reviewed-by: Laine Stump <laine>
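
For readers unfamiliar with the pattern described in the commit message, here is a minimal standalone sketch of how an "append and clear the pointer" helper can end up clearing only a local copy. It is not libvirt code: `Device`, `DeviceList`, `list_append`, `list_add_by_value`, `list_add_by_ref`, and `ownership_demo` are hypothetical names made up for illustration, and `list_append` only imitates the ownership-transfer idiom the commit attributes to VIR_APPEND_ELEMENT. Passing the address of the caller's pointer is shown as one way to avoid the stale alias, not as a statement of what the upstream patch changed.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for illustration; these are not libvirt's types. */
typedef struct {
    char *path;
} Device;

typedef struct {
    Device **devs;
    size_t count;
} DeviceList;

Device *device_new(const char *path)
{
    size_t len = strlen(path) + 1;
    Device *dev = malloc(sizeof(*dev));

    if (!dev)
        return NULL;
    dev->path = malloc(len);
    if (!dev->path) {
        free(dev);
        return NULL;
    }
    memcpy(dev->path, path, len);
    return dev;
}

void device_free(Device *dev)
{
    if (!dev)
        return;
    free(dev->path);
    free(dev);
}

/* Append *item to the list and clear the pointer that was passed in,
 * signalling that the list now owns the device. */
int list_append(DeviceList *list, Device **item)
{
    Device **tmp = realloc(list->devs, (list->count + 1) * sizeof(*tmp));

    if (!tmp)
        return -1;
    list->devs = tmp;
    list->devs[list->count++] = *item;
    *item = NULL;
    return 0;
}

/* Problematic shape: dev is a by-value copy, so only the copy is cleared. */
int list_add_by_value(DeviceList *list, Device *dev)
{
    return list_append(list, &dev);
}

/* Safer shape: the caller's own pointer is cleared. */
int list_add_by_ref(DeviceList *list, Device **dev)
{
    return list_append(list, dev);
}

/* Free every device owned by the list, then the list storage itself. */
void list_clear(DeviceList *list)
{
    for (size_t i = 0; i < list->count; i++)
        device_free(list->devs[i]);
    free(list->devs);
    list->devs = NULL;
    list->count = 0;
}

/* Returns 0 if both add variants behaved as described, -1 on allocation failure. */
int ownership_demo(void)
{
    DeviceList list = {0};
    Device *a = device_new("mdev-a");
    Device *b = device_new("mdev-b");

    if (!a || !b || list_add_by_value(&list, a) < 0) {
        device_free(a);
        device_free(b);
        return -1;
    }
    /* The caller still holds an alias to a device the list now owns.
     * Calling device_free(a) here would leave list.devs[0] dangling. */
    assert(a != NULL && list.devs[0] == a);

    if (list_add_by_ref(&list, &b) < 0) {
        device_free(b);
        list_clear(&list);
        return -1;
    }
    /* The caller's pointer was cleared, so a cleanup path that frees it is a no-op. */
    assert(b == NULL);
    device_free(b);

    list_clear(&list); /* each device is freed exactly once, through the list */
    return 0;
}
```

In the by-value case, any later cleanup that frees the caller's pointer releases memory the list still references, which matches the kind of stale access visible in frame #1 of the backtrace.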

Comment 5 zhe peng 2017-05-25 08:43:24 UTC
I can reproduce this with build libvirt-3.2.0-4.el7.x86_64.
Verified with build libvirt-3.2.0-6.el7.x86_64.

Steps:
1. Prepare a domain with a mediated host device
....
 <hostdev mode='subsystem' type='mdev' managed='no' model='vfio-pci'>
      <source>
        <address uuid='9ea4eb07-c4a8-41d1-bd4a-850d1b0bef74'/>
      </source>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x09' function='0x0'/>
    </hostdev>
....
2. start the VM
3. restart libvirtd 
4. shutdown the VM
virsh # shutdown rhel7
Domain rhel7 is being shutdown

virsh # list --all
 Id    Name                           State
----------------------------------------------------
 -     rhel7                          shut off

virsh # 

No crash occurred; moving to verified.

Comment 6 errata-xmlrpc 2017-08-02 00:08:25 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:1846

