Bug 618663 - libvirtd daemon died (caused vdsm to die as well)
Status: CLOSED WORKSFORME
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: libvirt
Version: 6.0
Hardware: All
OS: Linux
Priority: low
Severity: medium
Target Milestone: rc
Target Release: ---
Assigned To: Daniel Veillard
QA Contact: Virtualization Bugs
Whiteboard: RHELNAK
Depends On:
Blocks: 581275
 
Reported: 2010-07-27 09:54 EDT by Haim
Modified: 2014-01-12 19:46 EST
CC List: 9 users

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2010-11-18 10:59:32 EST
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments
backtrace - libvirtd and vdsm (109.47 KB, text/plain)
2010-07-27 09:54 EDT, Haim

Description Haim 2010-07-27 09:54:22 EDT
Created attachment 434698
backtrace - libvirtd and vdsm

Description of problem:

During normal VM life-cycle operations (nothing in particular), I noticed that the host went down (the vdsm service). When I tried to restart the vdsm service, I noticed that the libvirtd process had died. I then looked for a core dump, and it appears libvirtd died before vdsm.

Attached is the backtrace extracted by gdb.

I don't have particular reproduction steps, but I will try to reproduce.

libvirt-0.8.1-17.el6.x86_64
vdsm-4.9-10.el6.x86_64
qemu-kvm-0.12.1.2-2.97.el6.x86_64
2.6.32-44.el6.x86_64
Comment 2 Daniel Berrange 2010-07-27 10:04:41 EDT
Two problems here:

 - The backtrace is only of VDSM, no backtrace for libvirtd, so we've no idea why libvirtd crashed.

 - Some debuginfo RPMs appear to be missing: gdb isn't resolving symbols in the libvirt Python module, e.g. see the '??' here:

#3  0x0000003707275736 in malloc_printerr (action=3, str=0x3707343ae8 "munmap_chunk(): invalid pointer", ptr=<value optimized out>) at malloc.c:6283
        buf = "0000000002063894"
        cp = <value optimized out>
#4  0x00007f1990636433 in ?? () from /usr/lib64/python2.6/site-packages/libvirtmod.so
No symbol table info available.
#5  0x0000003709edeb01 in call_function (f=<value optimized out>, throwflag=<value optimized out>) at Python/ceval.c:3750


Can you make sure libvirt-debuginfo is installed when generating the backtrace of VDSM? Also, can you provide a backtrace for libvirtd?
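A minimal sketch of what this comment asks for, assuming root access and the RHEL debuginfo repositories are enabled (the output filename is illustrative, not from the report):

```shell
# Install debuginfo packages so gdb can resolve symbols in the
# backtrace (debuginfo-install is provided by yum-utils).
debuginfo-install -y libvirt libvirt-python python

# Attach to the running libvirtd and dump full backtraces of all
# threads non-interactively, then detach.
gdb -p "$(pidof libvirtd)" -batch \
    -ex 'set pagination off' \
    -ex 'thread apply all bt full' > libvirtd-backtrace.txt 2>&1
```

The same gdb invocation against the vdsm process, after the debuginfo packages are installed, would produce the resolved VDSM backtrace requested above.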
Comment 3 RHEL Product and Program Management 2010-07-27 10:17:56 EDT
This issue has been proposed at a point when we are only considering blocker
issues in the current Red Hat Enterprise Linux release.

** If you would still like this issue considered for the current
release, ask your support representative to file as a blocker on
your behalf. Otherwise ask that it be considered for the next
Red Hat Enterprise Linux release. **
Comment 4 Haim 2010-07-27 10:38:41 EDT
Reproduced once again:

1) Run with 3 hosts (several guests on them).
2) Access the storage server via SSH and add an iptables rule to block
   communication to the host that runs the SPM (Storage Pool Manager, a vdsm
   term for the logical role, owned by one of the hosts, that manages all
   storage actions to prevent data corruption).
3) The vdsm process and libvirtd die 40 seconds later.

I will attach gdb to the libvirtd process and provide more info.
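The blocking rule in step 2 could be sketched as follows; the SPM host address is a placeholder, not taken from the report:

```shell
# Hypothetical address of the host currently holding the SPM role.
SPM_HOST=192.0.2.10

# On the storage server: drop all traffic to and from the SPM host,
# simulating the storage outage described above.
iptables -I INPUT  -s "$SPM_HOST" -j DROP
iptables -I OUTPUT -d "$SPM_HOST" -j DROP

# To restore connectivity after the test, delete the same rules:
iptables -D INPUT  -s "$SPM_HOST" -j DROP
iptables -D OUTPUT -d "$SPM_HOST" -j DROP
```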
Comment 6 Daniel Berrange 2010-07-27 10:57:10 EDT
> 2) access to storage server via ssh and add iptable rule to block communication 
>    to host that runs SPM (only - storage pool manager, vdsm term for logical role
>   that owns by one of the hosts, that manages all the actions regarding the 
>   storage to prevent data corruption). 

So IIUC, you are fencing block I/O from the guests. This should be generating block I/O error notifications from qemu to libvirt to vdsm.
Comment 9 RHEL Product and Program Management 2010-08-18 17:18:51 EDT
Thank you for your bug report. This issue was evaluated for inclusion
in the current release of Red Hat Enterprise Linux. Unfortunately, we
are unable to address this request in the current release. Because we
are in the final stage of Red Hat Enterprise Linux 6 development, only
significant, release-blocking issues involving serious regressions and
data corruption can be considered.

If you believe this issue meets the release blocking criteria as
defined and communicated to you by your Red Hat Support representative,
please ask your representative to file this issue as a blocker for the
current release. Otherwise, ask that it be evaluated for inclusion in
the next minor release of Red Hat Enterprise Linux.
Comment 10 Dave Allan 2010-11-08 17:04:02 EST
Haim, have you seen this crash again recently?
Comment 11 Haim 2010-11-18 10:59:32 EST
No. It's hard to reproduce, and libvirtd doesn't dump cores by default, so it's also hard to catch. If I hit it again, I will reopen.
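Since the bug was closed for lack of a core dump, a sketch of how core dumps could be enabled for libvirtd on RHEL 6 ahead of a recurrence (the core directory is an assumption; the sysconfig variable is the mechanism honored by RHEL 6 init scripts):

```shell
# RHEL 6 init scripts read DAEMON_COREFILE_LIMIT from the daemon's
# sysconfig file and apply it before starting the process.
echo 'DAEMON_COREFILE_LIMIT=unlimited' >> /etc/sysconfig/libvirtd
service libvirtd restart

# Optionally write cores to a predictable location with the program
# name (%e) and PID (%p) in the filename.
mkdir -p /var/cores
echo '/var/cores/core.%e.%p' > /proc/sys/kernel/core_pattern
```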
