Bug 738778 - libvirtd crash during restart if running guest has <filterref>
Summary: libvirtd crash during restart if running guest has <filterref>
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: libvirt
Version: 6.1
Hardware: Unspecified
OS: Unspecified
Target Milestone: rc
Assignee: Laine Stump
QA Contact: Virtualization Bugs
Depends On:
Blocks: 743047
Reported: 2011-09-15 18:48 UTC by Laine Stump
Modified: 2011-12-06 11:31 UTC (History)

Fixed In Version: libvirt-0.9.4-12.el6
Doc Type: Bug Fix
Doc Text:
Clone Of:
Last Closed: 2011-12-06 11:31:35 UTC

Attachments
domain xml of the domain containing the filter reference that induces the crash. (1.98 KB, text/plain)
2011-09-16 01:20 UTC, Laine Stump

System | ID | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHBA-2011:1513 | normal | SHIPPED_LIVE | libvirt bug fix and enhancement update | 2011-12-06 01:23:30 UTC

Description Laine Stump 2011-09-15 18:48:07 UTC

If libvirtd is restarted while a guest that has a <filterref> in its <interface> definition is already running, the new libvirtd segfaults because the nwfilter driver->nwfilters pointer is uninitialized.

How to reproduce:

1) Add "<filterref filter='clean-traffic'/>" to the <interface> section of a guest.

2) Start the guest.

3) From a root shell prompt on the host, run "/etc/init.d/libvirtd restart".

After the current libvirtd is stopped, the new libvirtd should crash during initialization.

How reproducible: 100% for me.

Here is an example backtrace:

#0  virNWFilterObjFindByName (nwfilters=0x28, 
    name=0x7f7bec130190 "disallow-dhcp") at conf/nwfilter_conf.c:2169
#1  0x00000000004d1138 in __virNWFilterInstantiateFilter (conn=0x7f7bec1309b0, 
    teardownOld=true, ifname=0x7f7bec1301d0 "vnet0", ifindex=73, linkdev=0x0, 
    nettype=VIR_DOMAIN_NET_TYPE_NETWORK, macaddr=0x7f7bec130804 "RT", 
    filtername=0x7f7bec130190 "disallow-dhcp", filterparams=0x7f7bec1308b0, 
    useNewFilter=INSTANTIATE_ALWAYS, driver=0x0, forceWithPendingReq=false, 
    foundNewFilter=0x7f7bf0b36b4f) at nwfilter/nwfilter_gentech_driver.c:795
#2  0x00000000004d1a53 in _virNWFilterInstantiateFilter (conn=0x7f7bec1309b0, 
    net=0x7f7bec130800, teardownOld=true, useNewFilter=INSTANTIATE_ALWAYS, 
    foundNewFilter=0x7f7bf0b36b4f) at nwfilter/nwfilter_gentech_driver.c:913
#3  0x00000000004d1c2a in virNWFilterInstantiateFilter (
    conn=<value optimized out>, net=<value optimized out>)
    at nwfilter/nwfilter_gentech_driver.c:984
#4  0x0000000000484708 in qemuProcessFiltersInstantiate (
    opaque=<value optimized out>) at qemu/qemu_process.c:2258
#5  qemuProcessReconnect (opaque=<value optimized out>)
    at qemu/qemu_process.c:2578
#6  0x000000357c457512 in virThreadHelper (data=<value optimized out>)
    at util/threads-pthread.c:157
#7  0x00000035640077e1 in ?? ()
#8  0x00007f7bf0b37700 in ?? ()
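
One possible reading of frames #0 and #1: frame #1 shows driver=0x0 and frame #0 shows nwfilters=0x28, which is what you would get if the caller passed the address of a member of a NULL driver struct, so the "pointer" is really just a member offset. The sketch below illustrates that arithmetic. The struct layout and all names in it are hypothetical, chosen for illustration and not copied from the libvirt source.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical list type, for illustration only. */
typedef struct {
    unsigned int count;
    void **objs;
} filter_list_t;

/* Hypothetical driver state: a mutex followed by the filter list.
 * This layout is an assumption, not libvirt's actual definition. */
typedef struct {
    pthread_mutex_t lock;
    filter_list_t nwfilters;
} driver_state_t;

/* Value a callee would receive for "&driver->nwfilters" when driver is
 * NULL: the member's offset. offsetof() is used rather than actually
 * dereferencing NULL, which would be undefined behavior. */
size_t nwfilters_arg_for_null_driver(void)
{
    return offsetof(driver_state_t, nwfilters);
}
```

On x86_64 Linux with glibc, pthread_mutex_t is 40 bytes, so under this assumed layout the function returns 40 (0x28), matching the value in frame #0.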

Comment 2 Stefan Berger 2011-09-16 00:26:44 UTC
I have tried with libvirt 0.9.4 and don't see this happening at all. It looks like the nwfilters pointer is corrupted. Can you post the XML of your VM, the XML of the 'disallow-dhcp' filter (which I don't have on my system), and the filter referencing it?


Comment 3 Laine Stump 2011-09-16 01:20:39 UTC
Created attachment 523472 [details]
domain xml of the domain containing the filter reference that induces the crash.

I changed the domain xml to use the standard included "clean-traffic" filter, and the problem persists, so I'm sending just the domain xml (since the filter is part of the libvirt rpm).

Note that if the domain is not running when libvirtd starts, libvirtd *doesn't* crash if I then start the domain. So the pointer is only "improper" (whether it's corrupt or uninitialized) during virDomainLoadAllConfigs() - later on it is again back to normal.

Comment 4 Laine Stump 2011-09-16 14:18:19 UTC
Stefan found the problem and committed a fix upstream:

commit 3f2cb3ab595b3c185f6f814a5e2f46f4866b45a9
Author: Stefan Berger <stefanb@us.ibm.com>
Date:   Fri Sep 16 09:44:43 2011 -0400

    Fix buzzilla 738778
    This patch fixes the bug shown in bugzilla 738778. It's not an nwfilter
    problem but a connection sharing / closure issue.
    Depending on the speed / #CPUs of the machine you are using you may not
    see this bug all the time.

A more detailed explanation: qemuProcessReconnectAll opens a connection and starts several threads which may use the conn data, but then closes the conn without waiting for the threads to complete. The solution is to add an extra conn open before starting each thread, then have the threads close the conn when they are finished.

A rebased patch has been sent to rhvirt-patches for inclusion in RHEL6.
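
The pattern described in the explanation above (take an extra reference on the connection before starting each thread, and have each thread drop its own reference when done) can be sketched roughly as follows. This is a minimal illustration using plain pthreads, not the actual patch: conn_t, conn_open, conn_ref, conn_unref, reconnect_worker and reconnect_all are hypothetical names standing in for the libvirt connection and reconnect code.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for a shared connection object. */
typedef struct {
    pthread_mutex_t lock;
    int refs;     /* number of holders; freed when this reaches 0 */
    int payload;  /* data the worker threads read */
} conn_t;

static conn_t *conn_open(int payload)
{
    conn_t *c = malloc(sizeof(*c));
    if (!c)
        return NULL;
    pthread_mutex_init(&c->lock, NULL);
    c->refs = 1;
    c->payload = payload;
    return c;
}

static void conn_ref(conn_t *c)
{
    pthread_mutex_lock(&c->lock);
    c->refs++;
    pthread_mutex_unlock(&c->lock);
}

static void conn_unref(conn_t *c)
{
    pthread_mutex_lock(&c->lock);
    assert(c->refs > 0);
    int remaining = --c->refs;
    pthread_mutex_unlock(&c->lock);
    if (remaining == 0) {  /* last holder frees the object */
        pthread_mutex_destroy(&c->lock);
        free(c);
    }
}

typedef struct {
    conn_t *conn;
    int seen;  /* payload value the worker observed */
} job_t;

static void *reconnect_worker(void *opaque)
{
    job_t *job = opaque;
    job->seen = job->conn->payload;  /* safe: the worker holds its own ref */
    conn_unref(job->conn);           /* the worker closes when finished */
    return NULL;
}

/* Starts nworkers threads and returns how many read the expected
 * payload, or -1 on allocation failure. */
int reconnect_all(int nworkers)
{
    const int expected = 42;
    if (nworkers <= 0)
        return 0;
    conn_t *conn = conn_open(expected);
    pthread_t *tids = calloc(nworkers, sizeof(*tids));
    job_t *jobs = calloc(nworkers, sizeof(*jobs));
    int started = 0, ok = 0;
    if (!conn || !tids || !jobs) {
        if (conn)
            conn_unref(conn);
        free(tids);
        free(jobs);
        return -1;
    }
    for (int i = 0; i < nworkers; i++) {
        jobs[started].conn = conn;
        conn_ref(conn);  /* extra reference taken BEFORE the thread starts */
        if (pthread_create(&tids[started], NULL, reconnect_worker,
                           &jobs[started]) != 0) {
            conn_unref(conn);  /* thread never ran; give the reference back */
            break;
        }
        started++;
    }
    conn_unref(conn);  /* the caller closes immediately, without waiting */
    for (int i = 0; i < started; i++) {
        pthread_join(tids[i], NULL);  /* join only to collect demo results */
        if (jobs[i].seen == expected)
            ok++;
    }
    free(tids);
    free(jobs);
    return ok;
}
```

Calling reconnect_all(4) should return 4. Without the conn_ref() call before pthread_create(), the caller's early conn_unref() could free the object while workers are still reading it, which is the kind of race the commit message says may not show up on every machine.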


Comment 7 errata-xmlrpc 2011-12-06 11:31:35 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

