Bug 1034312 - vdsm-tool segfaults during vdsmd start
Summary: vdsm-tool segfaults during vdsmd start
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: libvirt
Version: 19
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: unspecified
Target Milestone: ---
Assignee: Cole Robinson
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard: infra
Depends On:
Blocks:
 
Reported: 2013-11-25 15:11 UTC by Yedidyah Bar David
Modified: 2013-12-31 02:03 UTC
CC: 23 users

Fixed In Version: libvirt-1.0.5.8-1.fc19
Doc Type: Bug Fix
Doc Text:
Clone Of:
Cloned to: 1039991
Environment:
Last Closed: 2013-12-31 02:03:21 UTC
Type: Bug
Embargoed:


Attachments
logs-vdsm-issue-2013-11-26-01.tar.xz (90.51 KB, application/x-xz)
2013-11-26 08:05 UTC, Yedidyah Bar David

Description Yedidyah Bar David 2013-11-25 15:11:41 UTC
Description of problem:

During hosted-engine --deploy, '/bin/systemctl start vdsmd.service' fails.

journalctl -xn says:

Nov 25 16:42:21 didi-box1 kernel: vdsm-tool[14338]: segfault at 7f7306437ae0 ip 00007f7306437ae0 sp 00007f72fac6bf38 error 15 in libpython2.7.so.1.0[7f730641c000+3e000]
Nov 25 16:42:21 didi-box1 vdsmd_init_common.sh[14291]: /usr/libexec/vdsm/vdsmd_init_common.sh: line 70: 14337 Segmentation fault      "${VDSM_TOOL}" nwfilter
Nov 25 16:42:21 didi-box1 vdsmd_init_common.sh[14291]: vdsm: failed to execute nwfilter, error code 139
Nov 25 16:42:21 didi-box1 systemd[1]: vdsmd.service: control process exited, code=exited status=1

audit.log has:

type=ANOM_ABEND msg=audit(1385390541.172:919): auid=4294967295 uid=0 gid=0 ses=4294967295 subj=system_u:system_r:initrc_t:s0 pid=14289 comm="vdsm-tool" reason="memory violation" sig=11
type=SERVICE_START msg=audit(1385390541.176:920): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg=' comm="vdsmd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=failed'
type=SERVICE_START msg=audit(1385390541.277:921): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg=' comm="vdsmd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
type=SERVICE_STOP msg=audit(1385390541.277:922): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg=' comm="vdsmd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
type=ANOM_ABEND msg=audit(1385390541.730:923): auid=4294967295 uid=0 gid=0 ses=4294967295 subj=system_u:system_r:initrc_t:s0 pid=14338 comm="vdsm-tool" reason="memory violation" sig=11
type=SERVICE_START msg=audit(1385390541.733:924): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg=' comm="vdsmd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=failed'
type=SERVICE_START msg=audit(1385390541.834:925): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg=' comm="vdsmd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
type=SERVICE_STOP msg=audit(1385390541.834:926): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg=' comm="vdsmd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'

Tell me if you need other logs.

Thanks.

Comment 1 Dan Kenigsberg 2013-11-25 15:45:04 UTC
How common is this? Could you reproduce and get a hold of a coredump, with python's and libvirt's debug information installed?

A python script's segfault is most probably not a bug in the script, but a bug in python or one of its modules (this time, I suspect libvirt's python binding).

Comment 2 Yaniv Bronhaim 2013-11-25 16:37:29 UTC
It happens as part of running the nwfilter operation, which must somehow be related to libvirt, because that's all it does: setting libvirt network filters using libvirtconnection.

Maybe libvirt.log can tell us more?
Can you reproduce it, or did it happen only once?
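
For reference, the nwfilter step boils down to roughly the following. This is a minimal sketch assuming the standard libvirt Python bindings; the connection URI and the filter XML here are illustrative, not vdsm's actual values:

# Illustrative sketch only -- not vdsm's actual nwfilter code.
# vdsm-tool's nwfilter command opens a libvirt connection,
# (re)defines its network filters, and closes the connection.
import libvirt

FILTER_XML = """<filter name='example-no-mac-spoofing' chain='mac'>
  <rule direction='out' action='return'>
    <mac srcmacaddr='$MAC'/>
  </rule>
</filter>"""  # hypothetical filter definition

conn = libvirt.open('qemu:///system')  # may authenticate via SASL
try:
    conn.nwfilterDefineXML(FILTER_XML)  # define or replace the filter
finally:
    conn.close()  # the crash in this bug happens around teardown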

Comment 3 Yedidyah Bar David 2013-11-26 08:01:52 UTC
(In reply to Dan Kenigsberg from comment #1)
> How common is this?

Quite common. It does not always happen, but if I stop both vdsmd and libvirtd before running deploy, it happens around 80% of the time.

> Could you reproduce and get a hold of a coredump, with
> python's and libvirt's debug information installed?

I'll try.

> 
> A python script's segfault is most probably not a bug in the script, but a
> bug in python or one of its modules (this time, I suspect libvirt's python
> binding).

Indeed.

Comment 4 Yedidyah Bar David 2013-11-26 08:05:03 UTC
Created attachment 829093 [details]
logs-vdsm-issue-2013-11-26-01.tar.xz

Comment 5 Yedidyah Bar David 2013-11-26 08:06:39 UTC
(In reply to Yaniv Bronhaim from comment #2)
> It happens as part of running the nwfilter operation, which must somehow
> be related to libvirt, because that's all it does: setting libvirt
> network filters using libvirtconnection.
> 
> Maybe libvirt.log can tell us more?
> Can you reproduce it, or did it happen only once?

Attached are logs of two attempts: the first did not fail, the second did. I moved the old logs aside before starting, to reduce noise. Each attempt consisted of:
service vdsmd stop
service libvirtd stop
hosted-engine --deploy

Comment 6 Yedidyah Bar David 2013-11-26 10:40:53 UTC
I rebooted the machine and now can't reproduce.

I guess we can close the bug and reopen if it happens again.

Comment 7 Sandro Bonazzola 2013-11-29 14:03:39 UTC
Happened to me today on F19:

 vdsm-4.13.0-196.gitb871832.fc19.x86_64
 libvirt-1.0.5.7-2.fc19.x86_64

Reproduced from the command line:

 # /usr/libexec/vdsm/vdsmd_init_common.sh --pre-start
 vdsm: Running mkdirs
 vdsm: Running configure_coredump
 vdsm: Running run_init_hooks
 vdsm: Running gencerts
 vdsm: Running check_is_configured
 libvirt is already configured for vdsm
 sanlock service is already configured
 vdsm: Running validate_configuration
 SUCCESS: ssl configured to true. No conflicts
 vdsm: Running prepare_transient_repository
 vdsm: Running syslog_available
 vdsm: Running nwfilter
 /usr/libexec/vdsm/vdsmd_init_common.sh: line 70:  8708 Errore di segmentazione (Segmentation fault) "${VDSM_TOOL}" nwfilter
 vdsm: failed to execute nwfilter, error code 139

dmesg:
[ 2099.578442] python[11612]: segfault at 340b38aa40 ip 000000340b38aa40 sp 00007fb5c0e6bf38 error 15 in libpython2.7.so.1.0[340b375000+3e000]


dmesg shows a lot of segfaults at boot:


[ 1068.633830] vdsm-tool[7289]: segfault at 340b390ae0 ip 000000340b390ae0 sp 00007f0d68e73f38 error 15 in libpython2.7.so.1.0[340b375000+3e000]
[ 1069.185782] vdsm-tool[7338]: segfault at 340b390ae0 ip 000000340b390ae0 sp 00007f7923478f38 error 15 in libpython2.7.so.1.0[340b375000+3e000]
[ 1069.729556] vdsm-tool[7387]: segfault at 340b390ae0 ip 000000340b390ae0 sp 00007fc75931ff38 error 15 in libpython2.7.so.1.0[340b375000+3e000]
[ 1070.272053] vdsm-tool[7436]: segfault at 340b390ae0 ip 000000340b390ae0 sp 00007f8d07489f38 error 15 in libpython2.7.so.1.0[340b375000+3e000]
[ 1070.814521] vdsm-tool[7485]: segfault at 340b390ae0 ip 000000340b390ae0 sp 00007f963c531f38 error 15 in libpython2.7.so.1.0[340b375000+3e000]
[ 1115.928107] vdsm-tool[7580]: segfault at 340b390ae0 ip 000000340b390ae0 sp 00007f7ac17f4f38 error 15 in libpython2.7.so.1.0[340b375000+3e000]
[ 1116.479487] vdsm-tool[7629]: segfault at 340b390ae0 ip 000000340b390ae0 sp 00007f078d3def38 error 15 in libpython2.7.so.1.0[340b375000+3e000]
[ 1117.030330] vdsm-tool[7678]: segfault at 340b390ae0 ip 000000340b390ae0 sp 00007f6bd0a7af38 error 15 in libpython2.7.so.1.0[340b375000+3e000]
[ 1117.572859] vdsm-tool[7727]: segfault at 340b390ae0 ip 000000340b390ae0 sp 00007f932b298f38 error 15 in libpython2.7.so.1.0[340b375000+3e000]
[ 1118.115472] vdsm-tool[7776]: segfault at 340b390ae0 ip 000000340b390ae0 sp 00007f479ac9df38 error 15 in libpython2.7.so.1.0[340b375000+3e000]
[ 1148.357380] vdsm-tool[7906]: segfault at 340b390ae0 ip 000000340b390ae0 sp 00007f927c6aaf38 error 15 in libpython2.7.so.1.0[340b375000+3e000]
[ 1148.908668] vdsm-tool[7956]: segfault at 340b390ae0 ip 000000340b390ae0 sp 00007feb38adff38 error 15 in libpython2.7.so.1.0[340b375000+3e000]
[ 1149.459646] vdsm-tool[8005]: segfault at 340b390ae0 ip 000000340b390ae0 sp 00007fd998e78f38 error 15 in libpython2.7.so.1.0[340b375000+3e000]
[ 1150.002007] vdsm-tool[8054]: segfault at 340b390ae0 ip 000000340b390ae0 sp 00007f91a8ba8f38 error 15 in libpython2.7.so.1.0[340b375000+3e000]
[ 1150.536238] vdsm-tool[8103]: segfault at 340b390ae0 ip 000000340b390ae0 sp 00007fccdc4e2f38 error 15 in libpython2.7.so.1.0[340b375000+3e000]
[ 1169.678838] vdsm-tool[8178]: segfault at 340b390ae0 ip 000000340b390ae0 sp 00007fbe86890f38 error 15 in libpython2.7.so.1.0[340b375000+3e000]
[ 1170.230246] vdsm-tool[8227]: segfault at 340b390ae0 ip 000000340b390ae0 sp 00007ffbdefc7f38 error 15 in libpython2.7.so.1.0[340b375000+3e000]
[ 1170.772681] vdsm-tool[8276]: segfault at 340b390ae0 ip 000000340b390ae0 sp 00007f70a2fc0f38 error 15 in libpython2.7.so.1.0[340b375000+3e000]
[ 1171.306788] vdsm-tool[8326]: segfault at 340b390ae0 ip 000000340b390ae0 sp 00007fcec9289f38 error 15 in libpython2.7.so.1.0[340b375000+3e000]
[ 1171.850647] vdsm-tool[8375]: segfault at 340b390ae0 ip 000000340b390ae0 sp 00007f6c13978f38 error 15 in libpython2.7.so.1.0[340b375000+3e000]
[ 1460.549896] vdsm-tool[8458]: segfault at 340b390ae0 ip 000000340b390ae0 sp 00007f4ca433bf38 error 15 in libpython2.7.so.1.0[340b375000+3e000]
[ 1461.100880] vdsm-tool[8508]: segfault at 340b390ae0 ip 000000340b390ae0 sp 00007f9ed705df38 error 15 in libpython2.7.so.1.0[340b375000+3e000]
[ 1461.677005] vdsm-tool[8557]: segfault at 340b390ae0 ip 000000340b390ae0 sp 00007fce2fa2ff38 error 15 in libpython2.7.so.1.0[340b375000+3e000]
[ 1462.227666] vdsm-tool[8606]: segfault at 340b390ae0 ip 000000340b390ae0 sp 00007f52b859af38 error 15 in libpython2.7.so.1.0[340b375000+3e000]
[ 1462.795247] vdsm-tool[8655]: segfault at 340b390ae0 ip 000000340b390ae0 sp 00007f3f9e3f5f38 error 15 in libpython2.7.so.1.0[340b375000+3e000]
[ 1478.736661] vdsm-tool[8709]: segfault at 7 ip 0000000000000007 sp 00007f75f4218f38 error 14 in python2.7[400000+1000]
[ 1521.334953] vdsm-tool[8891]: segfault at 340b390ae0 ip 000000340b390ae0 sp 00007f1bb1cfef38 error 15 in libpython2.7.so.1.0[340b375000+3e000]
[ 1521.886324] vdsm-tool[8940]: segfault at 340b390ae0 ip 000000340b390ae0 sp 00007fb6b658bf38 error 15 in libpython2.7.so.1.0[340b375000+3e000]
[ 1522.445516] vdsm-tool[8990]: segfault at 340b390ae0 ip 000000340b390ae0 sp 00007ff62840ef38 error 15 in libpython2.7.so.1.0[340b375000+3e000]
[ 1522.988032] vdsm-tool[9039]: segfault at 340b390ae0 ip 000000340b390ae0 sp 00007ff41e248f38 error 15 in libpython2.7.so.1.0[340b375000+3e000]
[ 1523.530568] vdsm-tool[9088]: segfault at 340b390ae0 ip 000000340b390ae0 sp 00007fe11b9d6f38 error 15 in libpython2.7.so.1.0[340b375000+3e000]
[ 1551.895983] vdsm-tool[9289]: segfault at 340b390ae0 ip 000000340b390ae0 sp 00007f77bdf98f38 error 15 in libpython2.7.so.1.0[340b375000+3e000]
[ 1552.447087] vdsm-tool[9339]: segfault at 340b390ae0 ip 000000340b390ae0 sp 00007f6e6ee45f38 error 15 in libpython2.7.so.1.0[340b375000+3e000]
[ 1552.991642] vdsm-tool[9388]: segfault at 340b390ae0 ip 000000340b390ae0 sp 00007f3bc826ff38 error 15 in libpython2.7.so.1.0[340b375000+3e000]
[ 1553.542483] vdsm-tool[9437]: segfault at 340b390ae0 ip 000000340b390ae0 sp 00007f0a71c61f38 error 15 in libpython2.7.so.1.0[340b375000+3e000]
[ 1554.085051] vdsm-tool[9486]: segfault at 340b390ae0 ip 000000340b390ae0 sp 00007f9976c78f38 error 15 in libpython2.7.so.1.0[340b375000+3e000]

Comment 8 Sandro Bonazzola 2013-11-29 14:43:49 UTC
gdb:

 [Thread debugging using libthread_db enabled]
 Using host libthread_db library "/lib64/libthread_db.so.1".
 Core was generated by `/usr/bin/python /usr/bin/vdsm-tool nwfilter'.
 Program terminated with signal 11, Segmentation fault.
 #0  0x0000000000000007 in ?? ()
 (gdb) py-bt
 #13 Frame 0x7f72680010b0, for file /usr/lib64/python2.7/site-packages/libvirt.py, line 299, in virEventRunDefaultImpl ()
    ret = libvirtmod.virEventRunDefaultImpl()

(gdb) bt
 #0  0x0000000000000007 in ?? ()
 #1  0x000000312480c2ce in _sasl_log (conn=<optimized out>, level=5, fmt=0x7f7267398218 "DIGEST-MD5 client mech dispose") at common.c:1985
 #2  0x00007f7267391d38 in digestmd5_client_mech_dispose (conn_context=0x143c5b0, utils=0x10ddec0) at digestmd5.c:4580
 #3  0x00000031248084c4 in client_dispose (pconn=0x10dcac0) at client.c:332
 #4  0x000000312480b4fb in sasl_dispose (pconn=0xe63338) at common.c:851
 #5  0x0000003411885a3b in virObjectUnref (anyobj=<optimized out>) at util/virobject.c:264
 #6  0x0000003411970908 in virNetSocketDispose (obj=0xe5dc50) at rpc/virnetsocket.c:1026
 #7  0x0000003411885a3b in virObjectUnref (anyobj=<optimized out>) at util/virobject.c:264
 #8  0x000000341186d182 in virEventPollCleanupHandles () at util/vireventpoll.c:582
 #9  0x000000341186e3cf in virEventPollRunOnce () at util/vireventpoll.c:651
 #10 0x000000341186cd0d in virEventRunDefaultImpl () at util/virevent.c:273
 #11 0x00007f726ce6cd06 in libvirt_virEventRunDefaultImpl (self=<optimized out>, args=<optimized out>) at libvirt.c:3076
 #12 0x000000340b0ddcee in call_function (oparg=<optimized out>, pp_stack=0x7f726ca02390) at /usr/src/debug/Python-2.7.5/Python/ceval.c:4098
 #13 PyEval_EvalFrameEx (f=f@entry=Frame 0x7f72680010b0, for file /usr/lib64/python2.7/site-packages/libvirt.py, line 299, in virEventRunDefaultImpl (), throwflag=throwflag@entry=0)
    at /usr/src/debug/Python-2.7.5/Python/ceval.c:2740

(gdb) t 2
 [Switching to thread 2 (Thread 0x7f72771b5740 (LWP 3295))]
 #0  _PyType_Lookup (type=type@entry=0x340b3875c0 <PyInt_Type>, name=name@entry='__class__') at /usr/src/debug/Python-2.7.5/Objects/typeobject.c:2518
2518	            dict = ((PyTypeObject *)base)->tp_dict;
 (gdb) py-bt
 #10 Frame 0x14b5e30, for file /usr/lib64/python2.7/site-packages/vdsm/libvirtconnection.py, line 78, in wrapper (args=(), kwargs={}, ret=0)
    if isinstance(ret, libvirt.virDomain):
 #14 Frame 0x13f73e0, for file /usr/lib64/python2.7/site-packages/vdsm/tool/nwfilter.py, line 35, in main (conn=<virConnect(restore=<function at remote 0x140a8c0>, nodeDeviceLookupSCSIHostByWWN=<function at remote 0x140a0c8>, listAllSecrets=<function at remote 0x1406398>, saveImageGetXMLDesc=<function at remote 0x140aa28>, virConnGetLastError=<function at remote 0x1433230>, listDomainsID=<function at remote 0x14066e0>, storagePoolLookupByUUIDString=<function at remote 0x140af50>, getCPUMap=<function at remote 0x13b56e0>, setKeepAlive=<function at remote 0x140ac80>, pingLibvirt=<instancemethod at remote 0xf60a00>, changeBegin=<function at remote 0x1432ed8>, dispatchDomainEventBlockPullCallback=<function at remote 0x13b5320>, _o=None, saveImageDefineXML=<function at remote 0x140a9b0>, listDefinedDomains=<function at remote 0x1406488>, getCapabilities=<function at remote 0x13b57d0>, getFreeMemory=<function at remote 0x13b58c0>, numOfDomains=<function at remote 0x140a398>, networkCreateXML=<function at remote 0x1406c8...(truncated)
    conn.close()
 #19 Frame 0xe62d90, for file /usr/bin/vdsm-tool, line 142, in main (opts=[], args=['nwfilter'], cmd='nwfilter')
    return tool_command[cmd]["command"](*args[1:])
 #22 Frame 0xe3d530, for file /usr/bin/vdsm-tool, line 145, in <module> ()
    sys.exit(main())

 (gdb) bt
 #0  _PyType_Lookup (type=type@entry=0x340b3875c0 <PyInt_Type>, name=name@entry='__class__') at /usr/src/debug/Python-2.7.5/Objects/typeobject.c:2518
 #1  0x000000340b0847a5 in _PyObject_GenericGetAttrWithDict (obj=0, name='__class__', dict=0x0) at /usr/src/debug/Python-2.7.5/Objects/object.c:1389
 #2  0x000000340b046a2e in recursive_isinstance (inst=0, cls=<type at remote 0x1363850>) at /usr/src/debug/Python-2.7.5/Objects/abstract.c:2892
 #3  0x000000340b04af55 in _PyObject_RealIsInstance (inst=<optimized out>, cls=<optimized out>) at /usr/src/debug/Python-2.7.5/Objects/abstract.c:3060
 #4  0x000000340b09caf2 in type___instancecheck__ (type=<optimized out>, inst=<optimized out>) at /usr/src/debug/Python-2.7.5/Objects/typeobject.c:595
 #5  0x000000340b049dd3 in PyObject_Call (func=func@entry=<built-in method __instancecheck__ of type object at remote 0x1363850>, arg=arg@entry=(0,), kw=kw@entry=0x0)
    at /usr/src/debug/Python-2.7.5/Objects/abstract.c:2529
 #6  0x000000340b04a65f in PyObject_CallFunctionObjArgs (callable=callable@entry=<built-in method __instancecheck__ of type object at remote 0x1363850>)
    at /usr/src/debug/Python-2.7.5/Objects/abstract.c:2760
 #7  0x000000340b04ac5f in PyObject_IsInstance (inst=0, cls=<type at remote 0x1363850>) at /usr/src/debug/Python-2.7.5/Objects/abstract.c:2963
 #8  0x000000340b0d50c9 in builtin_isinstance (self=<optimized out>, args=<optimized out>) at /usr/src/debug/Python-2.7.5/Python/bltinmodule.c:2452
 #9  0x000000340b0ddcee in call_function (oparg=<optimized out>, pp_stack=0x7fffaed4b120) at /usr/src/debug/Python-2.7.5/Python/ceval.c:4098
 #10 PyEval_EvalFrameEx (f=f@entry=Frame 0x14b5e30, for file /usr/lib64/python2.7/site-packages/vdsm/libvirtconnection.py, line 78, in wrapper (args=(), kwargs={}, ret=0), 
    throwflag=throwflag@entry=0) at /usr/src/debug/Python-2.7.5/Python/ceval.c:2740
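
For reference, the two threads above correspond to roughly the following pattern. This is a minimal sketch assuming the libvirt Python bindings; the URI is illustrative, and this is not vdsm's actual libvirtconnection code:

# Illustrative sketch only -- mirrors the two threads in the backtrace,
# not vdsm's actual code.
import threading
import libvirt

# Thread 1 of the backtrace: libvirt's default event loop. With the
# libvirt version in use here, the SASL callback array handed to
# sasl_client_new() can be freed while cyrus-sasl still holds a pointer
# to it, so a later sasl_dispose() triggered from this loop reads freed
# memory and can jump through a garbage pointer.
libvirt.virEventRegisterDefaultImpl()

def event_loop():
    while True:
        libvirt.virEventRunDefaultImpl()  # frames #10/#11 in thread 1

t = threading.Thread(target=event_loop)
t.daemon = True
t.start()

# Thread 2 of the backtrace: open a (possibly SASL-authenticated)
# connection and close it; the close hands the socket teardown to the
# event loop thread above.
conn = libvirt.open('qemu+tls://host.example.com/system')  # hypothetical URI
conn.close()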

Comment 9 Yaniv Bronhaim 2013-12-01 17:17:51 UTC
Dan, this relates to nwfilter, which uses libvirt. It looks like the segfault happens inside libvirt code, doesn't it?

Sandro, is it reproducible, or did it happen only once?

Comment 10 Dan Kenigsberg 2013-12-02 09:38:51 UTC
Jiri, could you take a look at the traceback? Even if we are doing something deeply wrong (what?) libvirtmod should not end up segfaulting.

Comment 11 Jiri Denemark 2013-12-02 13:13:49 UTC
The backtrace looks very similar to the one fixed by the following upstream commit (v1.2.0-rc1-3-g13fdc6d):

commit 13fdc6d63ef64f8e231a087e1dab7d90145c3c10
Author: Christophe Fergeau <cfergeau>
Date:   Fri Nov 22 17:27:21 2013 +0100

    Tie SASL callbacks lifecycle to virNetSessionSASLContext
    
    The array of sasl_callback_t callbacks which is passed to sasl_client_new()
    must be kept alive as long as the created sasl_conn_t object is alive as
    cyrus-sasl uses this structure internally for things like logging, so
    the memory used for callbacks must only be freed after sasl_dispose() has
    been called.
    
    During testing of successful SASL logins with
    virsh -c qemu+tls:///system list --all
    I've been getting invalid read reports from valgrind
    
    ==9237== Invalid read of size 8
    ==9237==    at 0x6E93B6F: _sasl_getcallback (common.c:1745)
    ==9237==    by 0x6E95430: _sasl_log (common.c:1850)
    ==9237==    by 0x16593D87: digestmd5_client_mech_dispose (digestmd5.c:4580)
    ==9237==    by 0x6E91653: client_dispose (client.c:332)
    ==9237==    by 0x6E9476A: sasl_dispose (common.c:851)
    ==9237==    by 0x4E225A1: virNetSASLSessionDispose (virnetsaslcontext.c:678)
    ==9237==    by 0x4CBC551: virObjectUnref (virobject.c:262)
    ==9237==    by 0x4E254D1: virNetSocketDispose (virnetsocket.c:1042)
    ==9237==    by 0x4CBC551: virObjectUnref (virobject.c:262)
    ==9237==    by 0x4E2701C: virNetSocketEventFree (virnetsocket.c:1794)
    ==9237==    by 0x4C965D3: virEventPollCleanupHandles (vireventpoll.c:583)
    ==9237==    by 0x4C96987: virEventPollRunOnce (vireventpoll.c:652)
    ==9237==    by 0x4C94730: virEventRunDefaultImpl (virevent.c:274)
    ==9237==    by 0x12C7BA: vshEventLoop (virsh.c:2407)
    ==9237==    by 0x4CD3D04: virThreadHelper (virthreadpthread.c:161)
    ==9237==    by 0x7DAEF32: start_thread (pthread_create.c:309)
    ==9237==    by 0x8C86EAC: clone (clone.S:111)
    ==9237==  Address 0xe2d61b0 is 0 bytes inside a block of size 168 free'd
    ==9237==    at 0x4A07577: free (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
    ==9237==    by 0x4C73827: virFree (viralloc.c:580)
    ==9237==    by 0x4DE4BC7: remoteAuthSASL (remote_driver.c:4219)
    ==9237==    by 0x4DE33D0: remoteAuthenticate (remote_driver.c:3639)
    ==9237==    by 0x4DDBFAA: doRemoteOpen (remote_driver.c:832)
    ==9237==    by 0x4DDC8DC: remoteConnectOpen (remote_driver.c:1031)
    ==9237==    by 0x4D8595F: do_open (libvirt.c:1239)
    ==9237==    by 0x4D863F3: virConnectOpenAuth (libvirt.c:1481)
    ==9237==    by 0x12762B: vshReconnect (virsh.c:337)
    ==9237==    by 0x12C9B0: vshInit (virsh.c:2470)
    ==9237==    by 0x12E9A5: main (virsh.c:3338)
    
    This commit changes virNetSASLSessionNewClient() to take ownership of the SASL
    callbacks. Then we can free them in virNetSASLSessionDispose() after the corresponding
    sasl_conn_t has been freed.

Comment 12 Dan Kenigsberg 2013-12-02 13:37:55 UTC
The bug appears in libvirt version: 1.0.5.7, package: 2.fc19 (Fedora Project, 2013-11-17-23:21:57, buildvm-18.phx2.fedoraproject.org).

Comment 13 Sandro Bonazzola 2013-12-02 14:18:13 UTC
(In reply to Yaniv Bronhaim from comment #9)
> Dan, this relates to nwfilter, which uses libvirt. It looks like the
> segfault happens inside libvirt code, doesn't it?
> 
> Sandro, is it reproducible, or did it happen only once?

Reproducible quite often, especially just after boot.

Comment 14 Cole Robinson 2013-12-02 17:47:08 UTC
I'll look into backporting it for F19.

Comment 15 Yedidyah Bar David 2013-12-08 15:36:16 UTC
A similar thing happened to me today, too, while running 'hosted-engine --deploy'. The backtrace in gdb looked similar, but I did not manage to reproduce it with debuginfo installed.

This time hosted-engine itself (python running otopi) died with a segfault.

From the previous comments I could not work out the status of this bug. Is comment 11 referring to what might be the root cause, or to a solution? What will be backported (comment 14)? There are no links in the "external trackers" section.

Comment 16 Yaniv Bronhaim 2013-12-09 13:40:13 UTC
Currently our Requires for libvirt packages are:

%if 0%{?rhel} >= 7 || 0%{?fedora} >= 18
Requires: libvirt-daemon >= 1.0.2-1
Requires: libvirt-daemon-config-nwfilter
Requires: libvirt-daemon-driver-network
Requires: libvirt-daemon-driver-nwfilter
Requires: libvirt-daemon-driver-qemu
%else
%if 0%{?rhel}
Requires: libvirt >= 0.10.2-29.el6
%else
Requires: libvirt >= 1.0.2-1
%endif
%endif

If this was fixed in 1.0.5.7, we still have it in vdsm, so the bug is not in POST yet. Where is libvirt 1.0.5.7 currently available?

Comment 17 Yaniv Bronhaim 2013-12-09 13:42:36 UTC
Excuse me, if I understand correctly libvirt still doesn't have a release with that fix. Jiri, which release will contain it?

Comment 18 Jiri Denemark 2013-12-09 16:16:26 UTC
My comment 11 mentioned a likely solution for this bug. This bug is in POST as it is fixed upstream. The patch just needs to be backported to Fedora 19 (Cole, when do you plan on making a new build with this patch in?) and we probably want to clone this bug for RHEL too.

Comment 19 Cole Robinson 2013-12-09 21:23:06 UTC
I'll push a build this week.

Comment 20 Fedora Update System 2013-12-14 21:19:38 UTC
libvirt-1.0.5.8-1.fc19 has been submitted as an update for Fedora 19.
https://admin.fedoraproject.org/updates/libvirt-1.0.5.8-1.fc19

Comment 21 Fedora Update System 2013-12-16 22:55:50 UTC
Package libvirt-1.0.5.8-1.fc19:
* should fix your issue,
* was pushed to the Fedora 19 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing libvirt-1.0.5.8-1.fc19'
as soon as you are able to.
Please go to the following URL:
https://admin.fedoraproject.org/updates/FEDORA-2013-23453/libvirt-1.0.5.8-1.fc19
then log in and leave karma (feedback).

Comment 22 Fedora Update System 2013-12-31 02:03:21 UTC
libvirt-1.0.5.8-1.fc19 has been pushed to the Fedora 19 stable repository.  If problems still persist, please make note of it in this bug report.

