RHEL Engineering is moving the tracking of its product development work on RHEL 6 through RHEL 9 to Red Hat Jira (issues.redhat.com). If you're a Red Hat customer, please continue to file support cases via the Red Hat customer portal. If you're not, please head to the "RHEL project" in Red Hat Jira and file new tickets here. Individual Bugzilla bugs in the statuses "NEW", "ASSIGNED", and "POST" are being migrated throughout September 2023. Bugs of Red Hat partners with an assigned Engineering Partner Manager (EPM) are migrated in late September as per pre-agreed dates. Bugs against components "kernel", "kernel-rt", and "kpatch" are only migrated if still in "NEW" or "ASSIGNED". If you cannot log in to RH Jira, please consult article #7032570. That failing, please send an e-mail to the RH Jira admins at rh-issues@redhat.com to troubleshoot your issue as a user management inquiry. The email creates a ServiceNow ticket with Red Hat. Individual Bugzilla bugs that are migrated will be moved to status "CLOSED", resolution "MIGRATED", and set with "MigratedToJIRA" in "Keywords". The link to the successor Jira issue will be found under "Links", have a little "two-footprint" icon next to it, and direct you to the "RHEL project" in Red Hat Jira (issue links are of type "https://issues.redhat.com/browse/RHEL-XXXX", where "X" is a digit). This same link will be available in a blue banner at the top of the page informing you that that bug has been migrated.
Bug 1033061 - [abrt] libvirt-daemon-1.1.1-12.el7: xsltFreeExtDef: Process /usr/sbin/libvirtd was killed by signal 11 (SIGSEGV)
Summary: [abrt] libvirt-daemon-1.1.1-12.el7: xsltFreeExtDef: Process /usr/sbin/libvirt...
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: libvirt
Version: 7.0
Hardware: x86_64
OS: Unspecified
unspecified
unspecified
Target Milestone: rc
: ---
Assignee: Michal Privoznik
QA Contact: Virtualization Bugs
URL:
Whiteboard: abrt_hash:b60cc3ca4585400bb07667cd9b8...
: 966449 (view as bug list)
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2013-11-21 13:50 UTC by Tomas Dolezal
Modified: 2014-11-05 12:06 UTC (History)
7 users (show)

Fixed In Version: libvirt-1.1.1-14.el7
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2014-06-13 12:15:11 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)
File: backtrace (31.23 KB, text/plain)
2013-11-21 13:50 UTC, Tomas Dolezal
no flags Details
File: cgroup (158 bytes, text/plain)
2013-11-21 13:50 UTC, Tomas Dolezal
no flags Details
File: core_backtrace (31.69 KB, text/plain)
2013-11-21 13:51 UTC, Tomas Dolezal
no flags Details
File: dso_list (8.80 KB, text/plain)
2013-11-21 13:51 UTC, Tomas Dolezal
no flags Details
File: environ (72 bytes, text/plain)
2013-11-21 13:51 UTC, Tomas Dolezal
no flags Details
File: limits (1.29 KB, text/plain)
2013-11-21 13:51 UTC, Tomas Dolezal
no flags Details
File: maps (41.96 KB, text/plain)
2013-11-21 13:51 UTC, Tomas Dolezal
no flags Details
File: open_fds (183 bytes, text/plain)
2013-11-21 13:51 UTC, Tomas Dolezal
no flags Details
File: proc_pid_status (1.03 KB, text/plain)
2013-11-21 13:51 UTC, Tomas Dolezal
no flags Details
File: var_log_messages (12.14 KB, text/plain)
2013-11-21 13:51 UTC, Tomas Dolezal
no flags Details

Description Tomas Dolezal 2013-11-21 13:50:44 UTC
Description of problem:
restarted libvirtd.service because of  inability to communicate with vm over network

Version-Release number of selected component:
libvirt-daemon-1.1.1-12.el7

Additional info:
reporter:       libreport-2.1.7
backtrace_rating: 4
cmdline:        /usr/sbin/libvirtd
crash_function: xsltFreeExtDef
executable:     /usr/sbin/libvirtd
kernel:         3.10.0-50.el7.x86_64
runlevel:       N 5
type:           CCpp
uid:            0

Truncated backtrace:
Thread no. 1 (10 frames)
 #1 xsltFreeExtDef at extensions.c:137
 #2 xsltFreeExtDefList at extensions.c:157
 #3 xsltFreeExts at extensions.c:482
 #4 xsltFreeStylesheet at xslt.c:964
 #5 drv_close at drv_redhat.c:383
 #6 ncf_close at netcf.c:101
 #7 virObjectUnref at util/virobject.c:262
 #8 netcfInterfaceClose at interface/interface_backend_netcf.c:265
 #9 virConnectDispose at datatypes.c:149
 #10 virObjectUnref at util/virobject.c:262

Comment 1 Tomas Dolezal 2013-11-21 13:50:51 UTC
Created attachment 827186 [details]
File: backtrace

Comment 2 Tomas Dolezal 2013-11-21 13:50:56 UTC
Created attachment 827187 [details]
File: cgroup

Comment 3 Tomas Dolezal 2013-11-21 13:51:02 UTC
Created attachment 827188 [details]
File: core_backtrace

Comment 4 Tomas Dolezal 2013-11-21 13:51:07 UTC
Created attachment 827189 [details]
File: dso_list

Comment 5 Tomas Dolezal 2013-11-21 13:51:13 UTC
Created attachment 827190 [details]
File: environ

Comment 6 Tomas Dolezal 2013-11-21 13:51:17 UTC
Created attachment 827191 [details]
File: limits

Comment 7 Tomas Dolezal 2013-11-21 13:51:30 UTC
Created attachment 827194 [details]
File: maps

Comment 8 Tomas Dolezal 2013-11-21 13:51:36 UTC
Created attachment 827197 [details]
File: open_fds

Comment 9 Tomas Dolezal 2013-11-21 13:51:43 UTC
Created attachment 827199 [details]
File: proc_pid_status

Comment 10 Tomas Dolezal 2013-11-21 13:51:49 UTC
Created attachment 827201 [details]
File: var_log_messages

Comment 11 Tomas Dolezal 2013-11-21 13:52:00 UTC
Created attachment 827205 [details]
File: sosreport.tar.xz

Comment 13 Michal Privoznik 2013-11-26 17:50:48 UTC
Just for the record; it seems like we're hitting a gcc bug here. From the stack trace we can clearly see virObjectUnref() is called from two separate threads over the very same object (obj = 0x7f4968018260). Both threads got reference counter equal to zero and hence started executing destroy callback.

The code snippet that is responsible for atomic dec & test:

#  define virAtomicIntDecAndTest(atomic)                                \
    (__extension__ ({                                                   \
            (void)verify_true(sizeof(*(atomic)) == sizeof(int));        \
            (void)(0 ? *(atomic) ^ *(atomic) : 0);                      \
            __sync_fetch_and_sub((atomic), 1) == 1;                     \
        }))

This function is supposed to return true if the callee held the last reference to an object. Otherwise, false is expected. However, it seems like __sync_fetch_and_sub() is not that atomic as one might think as it returned 1 in both threads. I'm investigating this further and I'll keep the BZ updated.

Comment 14 Michal Privoznik 2013-11-28 12:19:18 UTC
Tomas, can you please attach debug logs http://wiki.libvirt.org/page/DebugLogs  and give me steps to reproduce? I'm unable to reproduce this myslef.

Comment 15 Michal Privoznik 2013-11-29 11:06:36 UTC
I *think* this patch may help me to understand what's going on. I've built a scratch build for you to test (if you have some spare time):

http://brewweb.devel.redhat.com/brew/taskinfo?taskID=6664683

Please make sure you've got debug logs turned on when reproducing the bug:

http://wiki.libvirt.org/page/DebugLogs

Comment 16 Tomas Dolezal 2013-11-29 15:09:25 UTC
Hi Michal,
unfortunately this abrt-caught bug's reproducer is unknown to me. It happened just after restarting libvirtd.service. I did it because networking between host&guest was not working at all. Neither the restart helped as it usually does. All went running again after reboot.

Comment 17 Michal Privoznik 2013-12-02 17:21:02 UTC
Nevermind. With couple of tries (and intentionally inserted sleep()-s into the right places) I was able to reproduce it. And I think I know what's going on. Unfortunatelly, I don't have a design how to fix it. Neiter has the upstream:

http://www.redhat.com/archives/libvir-list/2013-October/msg00754.html

Comment 18 Michal Privoznik 2013-12-03 10:39:55 UTC
I believe this patch solves the issue:

https://www.redhat.com/archives/libvir-list/2013-December/msg00149.html

Comment 21 yanbing du 2013-12-13 07:09:48 UTC
Thanks for @Michal's mail.
Now, for libvirt-1.1.1-14.el7.x86_64, the logs look like good, it this OK to move this bug to VERIFIED?
...
2013-12-13 07:03:51.031+0000: 11933: debug : virObjectNew:199 : OBJECT_NEW: obj=0x7f0f340b8d50 classname=virNetcfDriverState
2013-12-13 07:03:51.456+0000: 11933: debug : virObjectRef:293 : OBJECT_REF: obj=0x7f0f340b8d50
2013-12-13 07:03:51.457+0000: 11933: debug : virObjectUnref:256 : OBJECT_UNREF: obj=0x7f0f340b8d50
2013-12-13 07:03:51.522+0000: 11933: debug : virObjectRef:293 : OBJECT_REF: obj=0x7f0f340b8d50
2013-12-13 07:03:51.522+0000: 11933: debug : virObjectUnref:256 : OBJECT_UNREF: obj=0x7f0f340b8d50
2013-12-13 07:03:51.522+0000: 11933: debug : virObjectRef:293 : OBJECT_REF: obj=0x7f0f340b8d50
2013-12-13 07:03:51.523+0000: 11933: debug : virObjectUnref:256 : OBJECT_UNREF: obj=0x7f0f340b8d50
2013-12-13 07:03:51.523+0000: 11933: debug : virObjectRef:293 : OBJECT_REF: obj=0x7f0f340b8d50
2013-12-13 07:03:51.524+0000: 11933: debug : virObjectUnref:256 : OBJECT_UNREF: obj=0x7f0f340b8d50
...

Comment 22 Michal Privoznik 2013-12-13 08:05:03 UTC
I rather see whole log before confirming for sure. The key point here is that an object is not used *after* it has been disposed. That is:

2013-12-13 07:03:51.031+0000: 11933: debug : virObjectNew:199 : OBJECT_NEW: obj=0x7f0f340b8d50 classname=virNetcfDriverState

This creates a new object. Then we see a couple of refs and unrefs:

2013-12-13 07:03:51.456+0000: 11933: debug : virObjectRef:293 : OBJECT_REF: obj=0x7f0f340b8d50
2013-12-13 07:03:51.457+0000: 11933: debug : virObjectUnref:256 : OBJECT_UNREF: obj=0x7f0f340b8d50

So far so good. But what's missing is the last part, where the object is disposed, something like

2013-12-13 07:03:51.457+0000: 11933: debug : virObjectUnref:256 : OBJECT_DISPOSE: obj=0x7f0f340b8d50

and then we must not see any REF on obj=0x7f0f340b8d50.

Comment 23 yanbing du 2013-12-13 08:29:24 UTC
(In reply to Michal Privoznik from comment #22)
> I rather see whole log before confirming for sure. The key point here is
> that an object is not used *after* it has been disposed. That is:
> 
> 2013-12-13 07:03:51.031+0000: 11933: debug : virObjectNew:199 : OBJECT_NEW:
> obj=0x7f0f340b8d50 classname=virNetcfDriverState
> 
> This creates a new object. Then we see a couple of refs and unrefs:
> 
> 2013-12-13 07:03:51.456+0000: 11933: debug : virObjectRef:293 : OBJECT_REF:
> obj=0x7f0f340b8d50
> 2013-12-13 07:03:51.457+0000: 11933: debug : virObjectUnref:256 :
> OBJECT_UNREF: obj=0x7f0f340b8d50
> 
> So far so good. But what's missing is the last part, where the object is
> disposed, something like

Yes, i didn't restart ibvirtd before collect the logs, so missed the last part.
Now the logs look like:
2013-12-13 08:23:52.830+0000: 12158: debug : virObjectNew:199 : OBJECT_NEW: obj=0x7f22780b8990 classname=virNetcfDriverState
2013-12-13 08:23:53.271+0000: 12158: debug : virObjectRef:293 : OBJECT_REF: obj=0x7f22780b8990
2013-12-13 08:23:53.272+0000: 12158: debug : virObjectUnref:256 : OBJECT_UNREF: obj=0x7f22780b8990
2013-12-13 08:23:53.330+0000: 12158: debug : virObjectRef:293 : OBJECT_REF: obj=0x7f22780b8990
2013-12-13 08:23:53.331+0000: 12158: debug : virObjectUnref:256 : OBJECT_UNREF: obj=0x7f22780b8990
2013-12-13 08:23:53.332+0000: 12158: debug : virObjectRef:293 : OBJECT_REF: obj=0x7f22780b8990
2013-12-13 08:23:53.332+0000: 12158: debug : virObjectUnref:256 : OBJECT_UNREF: obj=0x7f22780b8990
2013-12-13 08:23:53.333+0000: 12158: debug : virObjectRef:293 : OBJECT_REF: obj=0x7f22780b8990
2013-12-13 08:23:53.333+0000: 12158: debug : virObjectUnref:256 : OBJECT_UNREF: obj=0x7f22780b8990
2013-12-13 08:24:25.768+0000: 12147: debug : virObjectUnref:256 : OBJECT_UNREF: obj=0x7f22780b8990
2013-12-13 08:24:25.768+0000: 12147: debug : virObjectUnref:258 : OBJECT_DISPOSE: obj=0x7f22780b8990


After OBJECT_DISPOSE, there no ref to the object.


> 
> 2013-12-13 07:03:51.457+0000: 11933: debug : virObjectUnref:256 :
> OBJECT_DISPOSE: obj=0x7f0f340b8d50
> 
> and then we must not see any REF on obj=0x7f0f340b8d50.

Comment 24 Michal Privoznik 2013-12-13 08:35:33 UTC
(In reply to yanbing du from comment #23)

> 
> After OBJECT_DISPOSE, there no ref to the object.
> 

Brilliant! The bug is fixed and can be moved to VERIFIED.

Comment 25 yanbing du 2013-12-16 02:41:12 UTC
(In reply to Michal Privoznik from comment #24)
> (In reply to yanbing du from comment #23)
> 
> > 
> > After OBJECT_DISPOSE, there no ref to the object.
> > 
> 
> Brilliant! The bug is fixed and can be moved to VERIFIED.

Thanks, move bug to VERIFIED.

Comment 26 Ludek Smid 2014-06-13 12:15:11 UTC
This request was resolved in Red Hat Enterprise Linux 7.0.

Contact your manager or support representative in case you have further questions about the request.

Comment 28 Pavel Hrdina 2014-11-05 12:06:34 UTC
*** Bug 966449 has been marked as a duplicate of this bug. ***


Note You need to log in before you can comment on or make changes to this bug.