Bug 1015108

Summary: Libvirtd crashes if firewalld is restarted while libvirtd is starting up
Product: Red Hat Enterprise Linux 7 Reporter: Daniel Berrangé <berrange>
Component: libvirtAssignee: Daniel Berrangé <berrange>
Status: CLOSED CURRENTRELEASE QA Contact: Virtualization Bugs <virt-bugs>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 7.0CC: acathrow, berrange, clalancette, dallan, dyuan, itamar, jblengino, jforbes, laine, libvirt-maint, techbox, thomas.jethan, veillard, virt-maint, ydu, zhwang
Target Milestone: rc   
Target Release: ---   
Hardware: i686   
OS: Unspecified   
Whiteboard: abrt_hash:189c9c690641a1ea509dd083b910ee04db813aee
Fixed In Version: libvirt-1.1.1-9.el7 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: 1014933 Environment:
Last Closed: 2014-06-13 11:28:50 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Daniel Berrangé 2013-10-03 13:09:34 UTC
+++ This bug was initially created as a clone of Bug #1014933 +++

Version-Release number of selected component:
libvirt-daemon-1.0.5.6-2.fc19

Additional info:
reporter:       libreport-2.1.7
backtrace_rating: 4
cmdline:        /usr/sbin/libvirtd
crash_function: virQEMUCloseCallbacksGetForConn
executable:     /usr/sbin/libvirtd
kernel:         3.11.2-201.fc19.i686.PAE
runlevel:       N 5
type:           CCpp
uid:            0

Truncated backtrace:
Thread no. 1 (10 frames)
 #0 virQEMUCloseCallbacksGetForConn at qemu/qemu_conf.c:861
 #1 virQEMUCloseCallbacksRun at qemu/qemu_conf.c:890
 #2 qemuConnectClose at qemu/qemu_driver.c:1057
 #3 virConnectDispose at datatypes.c:159
 #4 virObjectUnref at util/virobject.c:264
 #5 virConnectClose at libvirt.c:1503
 #6 nwfilterStateReload at nwfilter/nwfilter_driver.c:301
 #7 nwfilterFirewalldDBusFilter at nwfilter/nwfilter_driver.c:90
 #9 virDBusWatchCallback at util/virdbus.c:144
 #10 virEventPollDispatchHandles at util/vireventpoll.c:500

--- Additional comment from Thomas H. Jethan on 2013-10-03 07:50:10 BST ---



--- Additional comment from Thomas H. Jethan on 2013-10-03 07:50:17 BST ---



--- Additional comment from Thomas H. Jethan on 2013-10-03 07:50:25 BST ---



--- Additional comment from Thomas H. Jethan on 2013-10-03 07:50:33 BST ---



--- Additional comment from Thomas H. Jethan on 2013-10-03 07:50:40 BST ---



--- Additional comment from Thomas H. Jethan on 2013-10-03 07:50:47 BST ---



--- Additional comment from Thomas H. Jethan on 2013-10-03 07:50:56 BST ---



--- Additional comment from Thomas H. Jethan on 2013-10-03 07:51:03 BST ---



--- Additional comment from Thomas H. Jethan on 2013-10-03 07:51:11 BST ---



--- Additional comment from Thomas H. Jethan on 2013-10-03 07:51:20 BST ---



--- Additional comment from Daniel Berrange on 2013-10-03 10:56:52 BST ---

Oh this is a really fun bug, and by 'fun' I mean insanely obscure.

In one thread we have the libvirtd daemon startup code running, and its in the middle of QEMU state initialization.

#9  0xb00882e4 in qemuStateInitialize (privileged=true, callback=0xb77a0420 <daemonInhibitCallback>, opaque=0xb8b1fc98) at qemu/qemu_driver.c:595
        driverConf = 0xaf5afcd8 "/etc/libvirt/qemu.conf"
        conn = 0x0
        ebuf = "\000\260\025\267\024\071P\257\214\000\000\000\360\316\341\257\335\242\023\267\214\000\000\000\210\177X\257\001\000\000\000l\000\000\000\360\316\341\257\000\260\025\267\264\316\341\257\210\177X\257$\316\341\257$\316\341\257l\000\000\000\304\316\341\257\201\321LRl\000\000\000\235R\022\267\000)\233\351\260\316\341\257\000\000\000\000\253G\022\267\000\260\025\267\340\316\341\257\a\000\000\000\v\260\023\267\000\260\025\267\001\000\000\000\254\325\334\266\000\260\025\267\214\261\023\267 :P\257\037:P\257\000\000\000\000/\261\023\267\000\260\025\267uc\334\266\000\260\025\267A\262\023\267\037:P\257\000\000\000\000\001\000\000\000\000\000\000\000\340\316\341\257\334\316\341\257\001\000\000\000\001\000\000\000\033c\024\267"...
        membase = 0x0
        mempath = 0x0
        cfg = 0xaf509050
        run_uid = 4294967295
        run_gid = 4294967295
        __func__ = "qemuStateInitialize"
        __FUNCTION__ = "qemuStateInitialize"
#10 0xb74c5325 in virStateInitialize (privileged=true, callback=callback@entry=0xb77a0420 <daemonInhibitCallback>, opaque=opaque@entry=0xb8b1fc98) at libvirt.c:833
        i = 6
        __func__ = "virStateInitialize"
#11 0xb77a049e in daemonRunStateInit (opaque=opaque@entry=0xb8b1fc98) at libvirtd.c:876
        srv = 0xb8b1fc98
        __func__ = "daemonRunStateInit"



In another thread, we have a dbus event being handled by the nwfilter driver, and the nwfilter driver calls into the QEMU driver....which has not finished initializing itself yet!
 

Thread 1 (Thread 0xb6366ac0 (LWP 7041)):
#0  0xb0052861 in virQEMUCloseCallbacksGetForConn (closeCallbacks=0x0, conn=0xb8b2cc20) at qemu/qemu_conf.c:861
        list = 0xb8ac57e8
        data = {conn = 0xb8b2cc20, list = 0xb8ac57e8, oom = false}
#1  virQEMUCloseCallbacksRun (closeCallbacks=0x0, conn=conn@entry=0xb8b2cc20, driver=0xaf50b350) at qemu/qemu_conf.c:890
        list = 0xb8b2cc20
        i = <optimized out>
        __func__ = "virQEMUCloseCallbacksRun"
#2  0xb009df3b in qemuConnectClose (conn=0xb8b2cc20) at qemu/qemu_driver.c:1057
        driver = <optimized out>
#3  0xb74babc1 in virConnectDispose (obj=0xb8b2cc20) at datatypes.c:159
        conn = 0xb8b2cc20
#4  0xb742f22c in virObjectUnref (anyobj=anyobj@entry=0xb8b2cc20) at util/virobject.c:264
        klass = 0xb8b2cba0
        obj = 0xb8b2cc20
        lastRef = true
        __func__ = "virObjectUnref"
#5  0xb74c5811 in virConnectClose (conn=conn@entry=0xb8b2cc20) at libvirt.c:1503
        __func__ = "virConnectClose"
        __FUNCTION__ = "virConnectClose"
#6  0xb023424e in nwfilterStateReload () at nwfilter/nwfilter_driver.c:301
        conn = 0xb8b2cc20
#7  0xb02342fc in nwfilterFirewalldDBusFilter (connection=0xaf501038, message=0xaf503910, user_data=0x0) at nwfilter/nwfilter_driver.c:90
        __func__ = "nwfilterFirewalldDBusFilter"
#8  0xb711efb9 in dbus_connection_dispatch (connection=0xaf501038) at dbus-connection.c:4631
        filter = <optimized out>
        next = 0x0
        message = 0xaf503910
        link = <optimized out>
        filter_list_copy = 0xaf5009dc
        message_link = 0xaf500a18
        result = DBUS_HANDLER_RESULT_NOT_YET_HANDLED
        pending = <optimized out>
        reply_serial = <optimized out>
        status = <optimized out>
        found_object = 3071507249
        __FUNCTION__ = "dbus_connection_dispatch"
#9  0xb740caeb in virDBusWatchCallback (fdatch=fdatch@entry=8, fd=15, events=1, opaque=0xaf500ca8) at util/virdbus.c:144
        watch = 0xaf500ca8
        info = 0xaf500de0
        dbus_flags = 1



This DBus event is triggered when the firewalld driver is reloaded, or restarted.

Thus what I think is happening here is that both libvirtd and firewalld have just been upgraded by yum. WHile libvirtd is finishing its restart, the firewalld reload occurs, causing the crash.

--- Additional comment from Daniel Berrange on 2013-10-03 10:57:13 BST ---



--- Additional comment from Daniel Berrange on 2013-10-03 10:57:24 BST ---



--- Additional comment from Daniel Berrange on 2013-10-03 11:06:18 BST ---

Yep, there were two updates for firewalld and libvirt which hit the repo at approx the same time


https://admin.fedoraproject.org/updates/FEDORA-2013-18032/firewalld-0.3.5-1.fc19
https://admin.fedoraproject.org/updates/FEDORA-2013-17618/libvirt-1.0.5.6-2.fc19


Can someone check their yum logs to see if they happened to have firewalld & libvirt updated on the same day. eg look at

# grep -E '(libvirt|firewalld)' /var/log/yum.log  | grep Installed

--- Additional comment from Dave Allan on 2013-10-03 13:50:02 BST ---

(In reply to Daniel Berrange from comment #14)
> Can someone check their yum logs to see if they happened to have firewalld &
> libvirt updated on the same day. eg look at

I didn't happen to hit this crash, but yes the libvirt and firewalld updates hit my system in the same yum update this morning.

Comment 2 Daniel Berrangé 2013-10-03 13:12:32 UTC
Patches posted upstream

https://www.redhat.com/archives/libvir-list/2013-October/msg00154.html


Note to QA, to reproduce the problem you need to do something like

  LIBVIRT_LOG_FILTERS=1:qemu LIBVIRT_LOG_OUTPUTS=1:stderr  /usr/sbin/libvirtd, 

and watch the log output. WHen you see it start to initialize the QEMU driver, run 'systemctl restart firewalld.service' in another shell

If you're lucky this could cause a crash. It is very susceptible to timing though - adding a 'sleep(10)' in the start of qemuStateInitialize method will help trigger it.

Comment 3 Daniel Berrangé 2013-10-03 13:14:04 UTC
NB this series also fixes a second problem - on architectures which do not have QEMU enabled, the nwfilter driver reload code does not work. This impacts the LXC driver on non-x86_64 archs in RHEL

Comment 6 yanbing du 2013-10-16 09:58:01 UTC
Following the note from comment 2, I can reproduce this bug.
I installed libvirt-1.1.1-8.el7.src.rpm and add 'sleep(10)' in the start of qemuStateInitialize method, compile it, then start the new libvirtd:

# ./daemon/libvirtd 
Segmentation fault (core dumped)

At the same time, restart firewalld in a loop, then libvirtd core dumped.

Using the fix libvirt version, libvirt-1.1.1-9.el7.src.rpm, redo the test, libvirtd will not crash, so this bug can move to VERIFIED.

Comment 7 Ludek Smid 2014-06-13 11:28:50 UTC
This request was resolved in Red Hat Enterprise Linux 7.0.

Contact your manager or support representative in case you have further questions about the request.