Bug 918959 - [abrt] libvirt-0.10.2-18.el6: _int_free: Process /usr/sbin/libvirtd was killed by signal 11 (SIGSEGV)
Keywords:
Status: CLOSED INSUFFICIENT_DATA
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: libvirt
Version: 6.4
Hardware: x86_64
OS: Unspecified
Priority: high
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: John Ferlan
QA Contact: Virtualization Bugs
URL:
Whiteboard: abrt_hash:2fc968e737a27deb64b13469804...
Depends On:
Blocks: 835616 928309 960054
 
Reported: 2013-03-07 10:07 UTC by David Jaša
Modified: 2018-12-01 16:31 UTC
CC List: 13 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2013-05-08 17:53:35 UTC
Target Upstream Version:
Embargoed:


Attachments
File: maps (25.64 KB, text/plain), 2013-03-07 10:07 UTC, David Jaša
File: var_log_messages (150 bytes, text/plain), 2013-03-07 10:07 UTC, David Jaša
File: environ (266 bytes, text/plain), 2013-03-07 10:07 UTC, David Jaša
File: dso_list (5.56 KB, text/plain), 2013-03-07 10:07 UTC, David Jaša
File: limits (1.29 KB, text/plain), 2013-03-07 10:07 UTC, David Jaša
File: sosreport.tar.xz (1.53 MB, text/plain), 2013-03-07 10:07 UTC, David Jaša
File: backtrace (59.79 KB, text/plain), 2013-03-07 10:07 UTC, David Jaša
File: build_ids (2.56 KB, text/plain), 2013-03-07 10:07 UTC, David Jaša
File: cgroup (88 bytes, text/plain), 2013-03-07 10:07 UTC, David Jaša

Description David Jaša 2013-03-07 10:07:24 UTC
Description of problem:
This crash occurred during back-and-forth migration of a VM (with another instance of the libvirt).

Version-Release number of selected component:
libvirt-0.10.2-18.el6

Additional info:
libreport version: 2.0.9
abrt_version:   2.0.8
backtrace_rating: 4
cmdline:        libvirtd --daemon --listen
crash_function: _int_free
kernel:         2.6.32-358.el6.x86_64

truncated backtrace:
:Thread no. 1 (7 frames)
: #0 _int_free at malloc.c
: #1 virFree at util/memory.c
: #2 virObjectUnref at util/virobject.c
: #3 virEventPollCleanupHandles at util/event_poll.c
: #4 virEventPollRunOnce at util/event_poll.c
: #5 virEventRunDefaultImpl at util/event.c
: #6 virNetServerRun at rpc/virnetserver.c

Comment 1 David Jaša 2013-03-07 10:07:28 UTC
Created attachment 706483
File: maps

Comment 2 David Jaša 2013-03-07 10:07:30 UTC
Created attachment 706484
File: var_log_messages

Comment 3 David Jaša 2013-03-07 10:07:32 UTC
Created attachment 706485
File: environ

Comment 4 David Jaša 2013-03-07 10:07:35 UTC
Created attachment 706486
File: dso_list

Comment 5 David Jaša 2013-03-07 10:07:37 UTC
Created attachment 706487
File: limits

Comment 6 David Jaša 2013-03-07 10:07:46 UTC
Created attachment 706488
File: sosreport.tar.xz

Comment 7 David Jaša 2013-03-07 10:07:49 UTC
Created attachment 706489
File: backtrace

Comment 8 David Jaša 2013-03-07 10:07:51 UTC
Created attachment 706490
File: build_ids

Comment 9 David Jaša 2013-03-07 10:07:54 UTC
Created attachment 706491
File: cgroup

Comment 14 Pavel Zhukov 2013-03-25 14:05:24 UTC
Is https://bugzilla.redhat.com/show_bug.cgi?id=924756 a duplicate of this bug?

Comment 15 John Ferlan 2013-03-25 22:49:17 UTC
Just so you know - I am digging into this. It's a bit slow going as I am new at digging into RH/libvirtd problems.  I'm running under the assumption it's an error path type thing right now.

I know the case indicates the error occurred with back-n-forth migration; however, I'm curious if there was anything else being attempted?  I note in the messages output from the sos tarball that there's a series of "Listening on interface #xx" and "Deleting interface #xx" messages right around the crash (where xx = 10, 11, 12, 13, 14, 15, & 16).  

Around the time 13, 15, & 16 go through their iterations there are other segfaults listed in the output dealing with qemu-kvm and libspice-server.so.

The reason I note this is that I have to "wonder" whether this type of migration was working without error until only recently. What caught my eye was the yum.log output indicating a recent change/update to spice-server, and I'm wondering if there's a relationship between the two. I'm not pointing fingers, just trying to glean some more data. In particular, if this was working well previously and libvirt didn't change, then what other environmental factor could have caused a failure?

Comment 16 Pavel Zhukov 2013-03-26 07:52:34 UTC
(In reply to comment #15)

> The reason I note this is I have to "wonder" if this type of migration was
> working without error until only recently.  What caught my eye was the
> yum.log output indicating a recent change/update to spice-server and I'm
> wondering if there's a relationship between the two. I'm not pointing
> fingers, but just trying to glean some more data.  In particular if this was
> working well previously and libvirt didn't change, then what other
> environmental factor could have caused a failure.

John, it's a RHEV-H system. You could not find any changes because there is no yum there, and we cannot install custom RPMs without hacks.
FYI, the problem case is with "Red Hat Enterprise Virtualization Hypervisor release 6.4 (20130306.2.el6_4)" and the bundled libvirt-0.10.2-18.el6.

Comment 17 Eric Blake 2013-03-26 20:19:07 UTC
I wonder if this upstream patch has any relation:
https://www.redhat.com/archives/libvir-list/2013-March/msg01489.html

Comment 18 Eric Blake 2013-03-26 20:28:28 UTC
Another one worth looking at (still needs upstream review as I type this comment):
https://www.redhat.com/archives/libvir-list/2013-March/msg01469.html

Comment 21 Eric Blake 2013-04-02 21:31:16 UTC
(In reply to comment #0)
> Description of problem:
> This crash occurred during back-and-forth migration of a VM (with another
> instance of the libvirt).

Was this using peer-to-peer migration? If so, then I'm pretty sure this patch series explains the problem:

https://www.redhat.com/archives/libvir-list/2013-March/msg01682.html

> 
> truncated backtrace:
> :Thread no. 1 (7 frames)
> : #0 _int_free at malloc.c
> : #1 virFree at util/memory.c
> : #2 virObjectUnref at util/virobject.c
> : #3 virEventPollCleanupHandles at util/event_poll.c

At any rate, this portion of the stack trace is consistent with trying to free through a pointer deleted in another thread.
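
To illustrate the ownership rule that such a stack trace suggests was violated, here is a minimal Python sketch. It is not libvirt code: `RefCounted` and `run_demo` are hypothetical names, and the class only models the idea that every thread holding a pointer also holds its own reference, so that the object is disposed exactly once, by whichever thread drops the last reference.

```python
import threading


class RefCounted:
    """Hypothetical stand-in for a reference-counted object (not virObject)."""

    def __init__(self):
        self._lock = threading.Lock()
        self.refs = 1           # the creator owns the first reference
        self.dispose_calls = 0  # how many times the object was "freed"

    def ref(self):
        with self._lock:
            if self.refs <= 0:
                raise RuntimeError("ref() on an already-disposed object")
            self.refs += 1
        return self

    def unref(self):
        with self._lock:
            if self.refs <= 0:
                raise RuntimeError("unref() on an already-disposed object")
            self.refs -= 1
            last = self.refs == 0
        if last:
            # Only the thread that dropped the final reference gets here.
            self.dispose_calls += 1
        return last


def run_demo(workers=8):
    obj = RefCounted()
    # Each worker is handed its own reference before it starts.
    for _ in range(workers):
        obj.ref()
    threads = [threading.Thread(target=obj.unref) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    obj.unref()  # the creator drops its reference last
    return obj
```

In this model, a thread that frees the object while another thread still uses a pointer it never took a reference for corresponds to the extra `unref()` that raises `RuntimeError`; in C the same mistake shows up as a crash inside `free()`.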

Comment 22 Eric Blake 2013-04-08 22:19:03 UTC
Peter's patches to fix the close callback race solve a problem introduced in upstream 0.10.0, and therefore present in RHEL 6.4 (based on upstream 0.10.2) but not 6.3 (based on upstream 0.9.10):
https://www.redhat.com/archives/libvir-list/2013-April/msg00672.html
As such, I'm adding the regression flag.

Comment 23 Peter Krempa 2013-04-09 10:20:44 UTC
A scratch build containing fixes that are believed to fix this problem is available at:

https://brewweb.devel.redhat.com/taskinfo?taskID=5610687

Comment 25 David Jaša 2013-04-09 19:07:04 UTC
(In reply to comment #21)
> (In reply to comment #0)
> > Description of problem:
> > This crash occurred during back-and-forth migration of a VM (with another
> > instance of the libvirt).
> 
> Was this using peer-to-peer migration?

If peer-to-peer migration is the result of a command like this:
virsh -c qemu+tcp://source_host/system migrate --live VM_NAME qemu+tcp://dest_host/system

then yes, it was peer-to-peer migration.

I hit the bug just once, though, so I can't tell decisively whether it is fixed for me.

Comment 26 Eric Blake 2013-04-09 19:22:25 UTC
Peer-to-peer migration requires the --p2p flag of 'virsh migrate'. The command line you used omitted --p2p, so it was a direct migration. http://libvirt.org/migration.html shows the difference: in direct migration, libvirt.so is the client to two different libvirtd processes; in peer-to-peer migration, libvirt.so is the client to only one libvirtd process, and that libvirtd is in turn a client to another libvirtd.

Peter's patches (comment 23) had to do with a crash in the client; that would explain the source libvirtd crashing on a peer-to-peer migration (since the source is a client to the destination), but would not explain you seeing a crash in libvirtd with direct migration (there, you would expect virsh to die as the client to either source or destination, but not for libvirtd to die).  See also bug 911609.
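
As a rough aid for the distinction drawn above, the following Python sketch classifies a 'virsh migrate' command line by the presence of --p2p and lists who is a client of whom. The function name `migration_links` and the labels "source libvirtd" and "destination libvirtd" are made up for illustration, and "direct" is used in the sense of this comment (no --p2p); the sketch only inspects the command string and does not talk to libvirt.

```python
import shlex


def migration_links(command):
    """Return (mode, links) for a 'virsh migrate' command line.

    links is a list of (client, server) pairs as described in this comment.
    """
    args = shlex.split(command)
    if "migrate" not in args:
        raise ValueError("not a 'virsh migrate' command line")
    if "--p2p" in args:
        # virsh talks to the source only; the source libvirtd is itself
        # a client of the destination libvirtd.
        return "peer-to-peer", [
            ("virsh", "source libvirtd"),
            ("source libvirtd", "destination libvirtd"),
        ]
    # Without --p2p, virsh is the client of both daemons.
    return "direct", [
        ("virsh", "source libvirtd"),
        ("virsh", "destination libvirtd"),
    ]


cmd = ("virsh -c qemu+tcp://source_host/system migrate --live VM_NAME "
       "qemu+tcp://dest_host/system")
mode, links = migration_links(cmd)
print(mode)
for client, server in links:
    print(client, "->", server)
```

With the command from comment 25, no libvirtd appears on the client side of any link, which is why a client-side crash fix would not be expected to explain a libvirtd crash there.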

I'm still looking for other potential races, where the race would affect the server rather than the client, to match with your report of libvirtd crashing on a direct migration.

Comment 27 Eric Blake 2013-04-10 02:04:30 UTC
Bug 915353 describes a crash on shutdown; it was fixed in libvirt-0.10.2-18.el6_4.1. I'm starting to think that this particular fix is the one that solves the problem at hand.

Comment 28 Eric Blake 2013-04-10 02:21:54 UTC
Another possible cause is a crash on auto-destroy, bug 950286. Migration uses auto-destroy on the destination until the source is far enough along in the migration process, so a bug there could crash libvirtd.

Comment 29 John Ferlan 2013-05-08 17:53:35 UTC
Since this problem was not easily reproduced and there is a patch available that resolves similarly described problems, I was asked to close this bug as insufficient data with a reference to the available patch.

If after installing the updates described here:

http://rhn.redhat.com/errata/RHBA-2013-0756.html

the problem still occurs, then feel free to reopen this case or open a new problem.

