Bug 1179981

Summary: libvirtd segfault when reloading while starting up
Product: Red Hat Enterprise Linux 7
Reporter: Hao Liu <hliu>
Component: libvirt
Assignee: Pavel Hrdina <phrdina>
Status: CLOSED ERRATA
QA Contact: Virtualization Bugs <virt-bugs>
Severity: medium
Priority: medium
Version: 7.1
CC: dyuan, hliu, lhuang, mzhan, rbalakri
Target Milestone: rc
Keywords: Upstream
Target Release: 7.2
Hardware: x86_64
OS: Linux
Fixed In Version: libvirt-1.2.13-1.el7
Doc Type: Bug Fix
Last Closed: 2015-11-19 06:07:22 UTC
Type: Bug

Description Hao Liu 2015-01-08 01:03:14 UTC
Description:
libvirtd segfaults when it is reloaded while it is still starting up.

Product version
libvirt-1.2.8-11.el7.x86_64

How reproducible
About 10% of attempts

Steps:
1. Reload libvirtd while it is still starting.
# systemctl restart libvirtd; systemctl reload libvirtd

2. Check ABRT
# abrt-cli list | head
id 094ba6a69277eef0d9d40f745db50ce27d7fb707
reason:         libvirtd killed by SIGSEGV
time:           Mon 05 Jan 2015 03:20:39 PM CST
cmdline:        /usr/sbin/libvirtd
package:        libvirt-daemon-1.2.8-11.el7
uid:            0 (root)
count:          5
Directory:      /var/tmp/abrt/ccpp-2015-01-05-15:20:39-10732
Run 'abrt-cli report /var/tmp/abrt/ccpp-2015-01-05-15:20:39-10732' for creating a case in Red Hat Customer Portal


Expected result:
No segfault
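
Because the crash only shows up in roughly one attempt out of ten, a bounded loop is easier to work with than a single restart/reload pair. The following is a minimal sketch, not a tested reproducer: the three commands are the ones used in this report, while the function names (race_once, race_loop) and the override variables (RESET_CMD, RESTART_CMD, RELOAD_CMD) are hypothetical conveniences. A failed restart or reload is only a hint that something went wrong; ABRT remains the authoritative check for the segfault.

```shell
#!/bin/sh
# Bounded restart/reload race loop (sketch).
# The *_CMD variables are hypothetical overrides; their defaults are the
# commands quoted in this bug report.
RESET_CMD=${RESET_CMD:-"systemctl reset-failed"}
RESTART_CMD=${RESTART_CMD:-"systemctl restart libvirtd"}
RELOAD_CMD=${RELOAD_CMD:-"systemctl reload libvirtd"}

# race_once: restart the daemon and immediately ask it to reload.
# Returns non-zero if the restart or the reload command fails.
race_once() {
    $RESET_CMD
    $RESTART_CMD && $RELOAD_CMD
}

# race_loop N: run race_once N times and print how many attempts failed.
race_loop() {
    attempts=$1
    failures=0
    i=0
    while [ "$i" -lt "$attempts" ]; do
        race_once || failures=$((failures + 1))
        i=$((i + 1))
    done
    echo "$failures"
}
```

Running `race_loop 100` as root and then checking `abrt-cli list | head` would exercise the same race about a hundred times.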

Backtrace (partial):
Thread 1 (Thread 0x7fd1d4a42880 (LWP 10732)):
#0  qemuConnectOpen (conn=0x7fd1d6325840, auth=<optimized out>, flags=<optimized out>) at qemu/qemu_driver.c:1114
#1  0x00007fd1d400689a in do_open (name=name@entry=0x7fd1bf0f05e2 "qemu:///system", auth=auth@entry=0x0, flags=flags@entry=0) at libvirt.c:1147
#2  0x00007fd1d40090d9 in virConnectOpen (name=name@entry=0x7fd1bf0f05e2 "qemu:///system") at libvirt.c:1317
#3  0x00007fd1bf0d4771 in storageDriverAutostart (driver=<optimized out>, driver=<optimized out>) at storage/storage_driver.c:85
#4  0x00007fd1bf0d4af1 in storageStateReload () at storage/storage_driver.c:237
#5  0x00007fd1d4008f38 in virStateReload () at libvirt.c:803
#6  0x00007fd1d4a910a7 in daemonReloadHandler (srv=srv@entry=0x7fd1d6244a80, sig=sig@entry=0x7fff36851fa0, opaque=opaque@entry=0x0) at libvirtd.c:807
#7  0x00007fd1d4ac1bfa in virNetServerSignalEvent (watch=watch@entry=2, fd=<optimized out>, events=events@entry=1, opaque=opaque@entry=0x7fd1d6244a80) at rpc/virnetserver.c:874
#8  0x00007fd1d3f48f8a in virEventPollDispatchHandles (fds=<optimized out>, nfds=<optimized out>) at util/vireventpoll.c:510
#9  virEventPollRunOnce () at util/vireventpoll.c:660
#10 0x00007fd1d3f47672 in virEventRunDefaultImpl () at util/virevent.c:308
#11 0x00007fd1d4ac389d in virNetServerRun (srv=0x7fd1d6244a80) at rpc/virnetserver.c:1139
#12 0x00007fd1d4a905b8 in main (argc=<optimized out>, argv=<optimized out>) at libvirtd.c:1507
...
Thread 6 (Thread 0x7fd1bbf67700 (LWP 10744)):
#0  0x00007fd1d1131b7d in poll () from /lib64/libc.so.6
#1  0x00007fd1d3f375c2 in poll (__timeout=-1, __nfds=2, __fds=0x7fd1bbf66280) at /usr/include/bits/poll2.h:41
#2  virCommandProcessIO (cmd=cmd@entry=0x7fd1b4034000) at util/vircommand.c:2018
#3  0x00007fd1d3f3bb22 in virCommandRun (cmd=cmd@entry=0x7fd1b4034000, exitstatus=exitstatus@entry=0x0) at util/vircommand.c:2238
#4  0x00007fd1d3f81963 in virSysinfoRead () at util/virsysinfo.c:844
#5  0x00007fd1bd680155 in qemuStateInitialize (privileged=<optimized out>, callback=0x7fd1d4a90e50 <daemonInhibitCallback>, opaque=<optimized out>) at qemu/qemu_driver.c:662
#6  0x00007fd1d4008d4f in virStateInitialize (privileged=true, callback=callback@entry=0x7fd1d4a90e50 <daemonInhibitCallback>, opaque=opaque@entry=0x7fd1d6244a80) at libvirt.c:743
#7  0x00007fd1d4a90eab in daemonRunStateInit (opaque=opaque@entry=0x7fd1d6244a80) at libvirtd.c:917
#8  0x00007fd1d3f8464e in virThreadHelper (data=<optimized out>) at util/virthread.c:197
#9  0x00007fd1d1815df5 in start_thread () from /lib64/libpthread.so.0
#10 0x00007fd1d113c1ad in clone () from /lib64/libc.so.6
...

Comment 1 Hao Liu 2015-01-08 10:01:13 UTC
Tested with the following commands on other versions of libvirt.

1. On the newest RHEL6 it works fine for at least several minutes.

# while (( 1 )); do service libvirtd restart; service libvirtd reload; virsh list; done

2. On RHEL7.0 with libvirt-1.1.1-29.el7.x86_64

# while ((1)); do systemctl reset-failed; systemctl restart libvirtd; systemctl reload libvirtd; virsh list; done

Most of the time it is fine, with the following line logged:
 journal: internal error: qemu state driver is not active
But it also fails occasionally with:
Thread 8 (Thread 0x7f0e57909880 (LWP 15422)):
#0  0x00007f0e5403cac0 in _int_realloc () from /lib64/libc.so.6
#1  0x00007f0e5403d702 in realloc () from /lib64/libc.so.6
#2  0x00007f0e55e35f91 in xmlParseComment () from /lib64/libxml2.so.2
#3  0x00007f0e55e3f4f3 in xmlParseContent () from /lib64/libxml2.so.2
#4  0x00007f0e55e3fd33 in xmlParseElement () from /lib64/libxml2.so.2
#5  0x00007f0e55e404aa in xmlParseDocument () from /lib64/libxml2.so.2
#6  0x00007f0e55e40787 in xmlDoRead () from /lib64/libxml2.so.2
#7  0x00007f0e55eefbce in xmlRelaxNGParse () from /lib64/libxml2.so.2
#8  0x00007f0e415afc36 in rng_parse () from /lib64/libnetcf.so.1
#9  0x00007f0e415ae787 in ncf_init () from /lib64/libnetcf.so.1
#10 0x00007f0e417c26ba in netcfStateReload () at interface/interface_backend_netcf.c:130
#11 0x00007f0e56f57068 in virStateReload () at libvirt.c:902
#12 0x00007f0e57955492 in daemonReloadHandler (srv=srv@entry=0x7f0e59a40e10, sig=sig@entry=0x7fff4af18180, opaque=opaque@entry=0x0) at libvirtd.c:798
#13 0x00007f0e56fbd60a in virNetServerSignalEvent (watch=watch@entry=2, fd=<optimized out>, events=events@entry=1, opaque=opaque@entry=0x7f0e59a40e10) at rpc/virnetserver.c:881
#14 0x00007f0e56eb4a0d in virEventPollDispatchHandles (fds=<optimized out>, nfds=<optimized out>) at util/vireventpoll.c:498
#15 virEventPollRunOnce () at util/vireventpoll.c:645
#16 0x00007f0e56eb316d in virEventRunDefaultImpl () at util/virevent.c:273
#17 0x00007f0e56fbf12d in virNetServerRun (srv=0x7f0e59a40e10) at rpc/virnetserver.c:1117
#18 0x00007f0e579549af in main (argc=<optimized out>, argv=<optimized out>) at libvirtd.c:1517
...
Thread 1 (Thread 0x7f0e3ef03700 (LWP 15434)):
#0  0x00007f0e54005a94 in vfprintf () from /lib64/libc.so.6
#1  0x00007f0e540ca495 in __vasprintf_chk () from /lib64/libc.so.6
#2  0x00007f0e415af573 in xasprintf () from /lib64/libnetcf.so.1
#3  0x00007f0e415af9b7 in parse_stylesheet () from /lib64/libnetcf.so.1
#4  0x00007f0e415b38a9 in drv_init () from /lib64/libnetcf.so.1
#5  0x00007f0e417c49ea in netcfStateInitialize (privileged=<optimized out>, callback=<optimized out>, opaque=<optimized out>) at interface/interface_backend_netcf.c:89
#6  0x00007f0e56f56e8a in virStateInitialize (privileged=true, callback=callback@entry=0x7f0e57955260 <daemonInhibitCallback>, opaque=opaque@entry=0x7f0e59a40e10) at libvirt.c:848
#7  0x00007f0e579552bb in daemonRunStateInit (opaque=opaque@entry=0x7f0e59a40e10) at libvirtd.c:908
#8  0x00007f0e56ee1f4e in virThreadHelper (data=<optimized out>) at util/virthreadpthread.c:194
#9  0x00007f0e5478cdf5 in start_thread () from /lib64/libpthread.so.0
#10 0x00007f0e540b31ad in clone () from /lib64/libc.so.6

Could it be a regression?

Comment 2 Pavel Hrdina 2015-02-18 15:25:07 UTC
Patch proposed upstream:

https://www.redhat.com/archives/libvir-list/2015-February/msg00643.html

Comment 3 Pavel Hrdina 2015-02-19 09:19:34 UTC
Fixed upstream:

commit 5c756e580f0ad4fd19f801e770d54167d1159162
Author: Pavel Hrdina <phrdina>
Date:   Wed Feb 18 16:10:58 2015 +0100

    daemon: Fix segfault by reloading daemon right after start

Comment 5 vivian zhang 2015-06-26 07:09:16 UTC
I can reproduce this bug with build libvirt-1.2.8-11.el7.x86_64.

1. Reload libvirtd while it is starting up; libvirtd crashes.

# while ((1)); do systemctl reset-failed; systemctl restart libvirtd; systemctl reload libvirtd; virsh list; done
...
 Id    Name                           State
----------------------------------------------------

error: failed to connect to the hypervisor
error: no valid connection
error: Cannot recv data: Connection reset by peer

...

2. Check the core dump info from ABRT.
# abrt-cli list | head

The Autoreporting feature is disabled. Please consider enabling it by issuing
'abrt-auto-reporting enabled' as a user with root privileges
id e4059b9e7bceb686adcbf0a69ec06f112caeb00e
reason:         libvirtd killed by SIGSEGV
time:           Fri 26 Jun 2015 02:33:24 PM CST
cmdline:        /usr/sbin/libvirtd
package:        libvirt-daemon-1.2.8-11.el7
uid:            0 (root)
count:          1
Directory:      /var/tmp/abrt/ccpp-2015-06-26-14:33:24-22682
Run 'abrt-cli report /var/tmp/abrt/ccpp-2015-06-26-14:33:24-22682' for creating a case in Red Hat Customer Portal


3. Check the backtrace using gdb.
# cd /var/tmp/abrt/ccpp-2015-06-26-14:33:24-22682
# gdb -c coredump
...
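
The ABRT check in step 2 can also be scripted so that a long loop run yields a single number. This is a rough sketch that assumes the `abrt-cli list` layout shown in this report (an `id` line starting each entry, then `reason:` and later `count:`); the function name sum_libvirtd_segv is hypothetical, and other ABRT versions may format the listing differently.

```shell
#!/bin/sh
# sum_libvirtd_segv: read `abrt-cli list` output on stdin and print the total
# of the "count:" fields for entries whose reason is a libvirtd SIGSEGV.
# Assumes "reason:" appears before "count:" within each entry, as in the
# listings quoted in this report.
sum_libvirtd_segv() {
    awk '
        /^id /     { match_entry = 0 }
        /^reason:/ { match_entry = ($0 ~ /libvirtd killed by SIGSEGV/) }
        /^count:/  { if (match_entry) total += $2 }
        END        { print total + 0 }
    '
}
```

Usage would be `abrt-cli list | sum_libvirtd_segv` (without `head`, which truncates the listing); comparing the number before and after a test run shows whether new crashes were recorded.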


Verified this bug with build libvirt-1.2.16-1.el7.x86_64.

Reloaded libvirtd repeatedly while it was starting up, for 10-20 minutes:

# while ((1)); do systemctl reset-failed; systemctl restart libvirtd; systemctl reload libvirtd; virsh list; done
 Id    Name                           State
----------------------------------------------------
 18    vm1                            running

 Id    Name                           State
----------------------------------------------------
 18    vm1                            running

 Id    Name                           State
----------------------------------------------------
 18    vm1                            running

 Id    Name                           State
----------------------------------------------------
 18    vm1                            running

 Id    Name                           State
----------------------------------------------------
 18    vm1                            running

 Id    Name                           State
----------------------------------------------------
 18    vm1                            running

....


No libvirtd crash occurred, so moving to VERIFIED.

Comment 7 errata-xmlrpc 2015-11-19 06:07:22 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-2202.html