Bug 604585 - [vdsm] [libvirt intg] Libvirtd dies and after a while sends an abort signal
Status: CLOSED CURRENTRELEASE
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: libvirt
Version: 6.0
Hardware: All, OS: Linux
Priority: high, Severity: urgent
Target Milestone: rc
Assigned To: Daniel Berrange
QA Contact: Virtualization Bugs
Whiteboard: vdsm & libvirt integration
Depends On:
Blocks: 581275 599016 Rhel6.0LibvirtTier2
 
Reported: 2010-06-16 06:01 EDT by Moran Goldboim
Modified: 2010-11-11 09:47 EST
CC List: 6 users

Fixed In Version: libvirt-0.8.1-14.el6
Doc Type: Bug Fix
Last Closed: 2010-11-11 09:47:48 EST


Attachments
core dump (2.05 MB, application/octet-stream) - 2010-06-16 06:01 EDT, Moran Goldboim
gdb backtrace (65.94 KB, text/plain) - 2010-06-16 06:02 EDT, Moran Goldboim
libvirt log (46.96 KB, application/gzip) - 2010-06-22 05:14 EDT, Moran Goldboim
backtrace (614.95 KB, application/octet-stream) - 2010-06-29 08:18 EDT, Moran Goldboim
backtrace full (49.97 KB, text/plain) - 2010-07-11 07:50 EDT, Moran Goldboim

Description Moran Goldboim 2010-06-16 06:01:39 EDT
Created attachment 424390 [details]
core dump

Description of problem:
Libvirtd dies and after a while sends an abort signal, after logging: 04:37:03.691: debug : virEventInterruptLocked:668 : Interrupting
[root@gold-vdsa ~]# service libvirtd status
libvirtd dead but pid file exists
It kills vdsm 24 minutes later.

Version-Release number of selected component (if applicable):
libvirt-0.8.1-8.el6.x86_64
vdsm-4.9-8.el6.x86_64

How reproducible:
happened once

Steps to Reproduce:
1. Happened during the RHEV-M VM sanity automation test suite, but I don't see a correlation.
2.
3.
  
Actual results:


Expected results:


Additional info:
Comment 1 Moran Goldboim 2010-06-16 06:02:32 EDT
Created attachment 424391 [details]
gdb backtrace
Comment 3 RHEL Product and Program Management 2010-06-16 06:13:17 EDT
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux major release.  Product Management has requested further
review of this request by Red Hat Engineering, for potential inclusion in a Red
Hat Enterprise Linux Major release.  This request is not yet committed for
inclusion.
Comment 4 Daniel Berrange 2010-06-16 07:44:44 EDT
The stack trace & core dumps attached here are both from the VDSM python process, not libvirtd. 

Can you provide a stack trace from libvirtd itself? Attach to libvirtd with GDB while it is still running normally, then do whatever you need to make it crash, and capture a dump with 'thread apply all backtrace'. NB, make sure libvirt-debuginfo is installed before running GDB.
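
For reference, a rough sketch of that capture procedure (assuming yum-utils is available for debuginfo-install; adjust package names and paths as needed):

  debuginfo-install libvirt            # install debug symbols first
  gdb -p $(pidof libvirtd)             # attach to the running daemon
  (gdb) continue                       # let it run, then reproduce the crash
  (gdb) thread apply all backtrace     # once gdb stops on the abort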

Can you also edit libvirtd.conf and set

  log_filters="1:qemu 1:util 1:security"
  log_outputs="1:file:/var/log/libvirt/libvirtd.log"

and attach the libvirtd.log file after the crash occurs.
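
Note that libvirtd has to be restarted after editing libvirtd.conf for the new logging settings to take effect, e.g.:

  service libvirtd restart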

Finally, can you elaborate on what you mean by "libvirtd sending an abort event"?
Comment 5 Moran Goldboim 2010-06-16 08:38:07 EDT
I meant an abort signal; I will try to reproduce it again.
Comment 6 Dan Kenigsberg 2010-06-21 03:54:40 EDT
Here's the tail of libvirtd.log after vdsm died (with an abort signal while in libvirt_virEventRegisterImpl) and libvirtd died (I don't know why):

09:16:27.461: debug : qemuMonitorJSONIOProcessEvent:99 : handle BLOCK_IO_ERROR handler=0x478140 data=0x7fdce5c1a9d0
09:16:27.461: warning : qemuMonitorJSONHandleIOError:597 : missing reason in disk io error event
09:16:27.461: debug : qemuMonitorEmitIOError:849 : mon=0x7fdd0c0109f0
09:16:27.461: debug : qemuMonitorJSONIOProcess:188 : Total used 204 bytes out of 204 available in buffer
09:16:27.462: debug : qemuMonitorJSONIOProcessLine:115 : Line [{"timestamp": {"seconds": 1277100987, "microseconds": 462251}, "event": "BLOCK_IO_ERROR", "data": {"device": "drive-ide0-0-0", "__com.redhat_reason": "eother", "operation": "write", "action": "report"}}]
09:16:27.462: debug : virJSONValueFromString:901 : string={"timestamp": {"seconds": 1277100987, "microseconds": 462251}, "event": "BLOCK_IO_ERROR", "data": {"device": "drive-ide0-0-0", "__com.redhat_reason": "eother", "operation": "write", "action": "report"}}
09:16:27.462: debug : virJSONParserHandleStartMap:783 : parser=0x7fdd1a159660
09:16:27.462: debug : virJSONParserHandleMapKey:764 : parser=0x7fdd1a159660 key=0x7fdce5b63062
09:16:27.462: debug : virJSONParserHandleStartMap:783 : parser=0x7fdd1a159660
09:16:27.462: debug : virJSONParserHandleMapKey:764 : parser=0x7fdd1a159660 key=0x7fdce5b63070
09:16:27.462: debug : virJSONParserHandleNumber:723 : parser=0x7fdd1a159660 str=(null)
09:16:27.462: debug : virJSONParserHandleMapKey:764 : parser=0x7fdd1a159660 key=0x7fdce5b63087
09:16:27.462: debug : virJSONParserHandleNumber:723 : parser=0x7fdd1a159660 str=(null)
09:16:27.462: debug : virJSONParserHandleEndMap:810 : parser=0x7fdd1a159660
09:16:27.462: debug : virJSONParserHandleMapKey:764 : parser=0x7fdd1a159660 key=0x7fdce5b630a0
09:16:27.462: debug : virJSONParserHandleString:744 : parser=0x7fdd1a159660 str=0x7fdce5b630a9
09:16:27.462: debug : virJSONParserHandleMapKey:764 : parser=0x7fdd1a159660 key=0x7fdce5b630bb
09:16:27.462: debug : virJSONParserHandleStartMap:783 : parser=0x7fdd1a159660
09:16:27.462: debug : virJSONParserHandleMapKey:764 : parser=0x7fdd1a159660 key=0x7fdce5b630c4
09:16:27.462: debug : virJSONParserHandleString:744 : parser=0x7fdd1a159660 str=0x7fdce5b630ce
09:16:27.462: debug : virJSONParserHandleMapKey:764 : parser=0x7fdd1a159660 key=0x7fdce5b630e0
09:16:27.462: debug : virJSONParserHandleString:744 : parser=0x7fdd1a159660 str=0x7fdce5b630f7
09:16:27.462: debug : virJSONParserHandleMapKey:764 : parser=0x7fdd1a159660 key=0x7fdce5b63101
09:16:27.462: debug : virJSONParserHandleString:744 : parser=0x7fdd1a159660 str=0x7fdce5b6310e
09:16:27.462: debug : virJSONParserHandleMapKey:764 : parser=0x7fdd1a159660 key=0x7fdce5b63117
09:16:27.462: debug : virJSONParserHandleString:744 : parser=0x7fdd1a159660 str=0x7fdce5b63121
09:16:27.462: debug : virJSONParserHandleEndMap:810 : parser=0x7fdd1a159660
09:16:27.462: debug : virJSONParserHandleEndMap:810 : parser=0x7fdd1a159660
09:16:27.462: debug : virJSONValueFromString:931 : result=0x7fdce5c1b350
09:16:27.462: debug : qemuMonitorJSONIOProcessEvent:86 : mon=0x7fdd040d0970 obj=0x7fdce5c1b350
09:16:27.462: debug : qemuMonitorJSONIOProcessEvent:99 : handle BLOCK_IO_ERROR handler=0x478140 data=0x7fdce5c1a1a0
09:16:27.462: warning : qemuMonitorJSONHandleIOError:597 : missing reason in disk io error event
09:16:27.462: debug : qemuMonitorEmitIOError:849 : mon=0x7fdd040d0970
09:16:27.462: debug : qemuMonitorJSONIOProcess:188 : Total used 204 bytes out of 204 available in buffer
09:16:27.463: debug : qemuMonitorIO:572 : Triggering EOF callback error? 1
09:16:27.463: debug : qemuHandleMonitorEOF:1106 : Received EOF on 0x7fdcfc021000 'VmStrApiTestVm'
09:16:27.463: debug : qemudShutdownVMDaemon:4120 : Shutting down VM 'VmStrApiTestVm' migrated=0
09:16:27.466: debug : qemuMonitorJSONCommandWithFd:222 : Receive command reply ret=-1 errno=104 0 bytes '(null)'
09:16:27.466: error : qemuMonitorJSONCommandWithFd:242 : cannot send monitor command '{"execute":"query-blockstats"}': Connection reset by peer
09:16:27.466: debug : qemuMonitorClose:682 : mon=0x7fdcfc024ea0
Comment 7 Moran Goldboim 2010-06-22 05:14:36 EDT
Created attachment 425872 [details]
libvirt log
Comment 8 Moran Goldboim 2010-06-22 05:15:14 EDT
In another instance of VDSM getting an abort signal, libvirtd didn't die. It happened right after a create domain was started on VDSM.
No indication of it was seen in the libvirtd log (at 16:28:54,520, with log_filters="1:qemu 1:util 1:security"); only after vdsm was up again and several more creates had happened did libvirt generate the following errors (log is attached):
16:36:40.763: error : qemudDomainBlockStats:9596 : Domain not found: no domain with matching uuid 'e41ced35-4019-4784-a97d-f29f1f4cbf06'
16:37:31.427: error : qemuMonitorOpenUnix:277 : monitor socket did not show up.: Connection refused
16:37:31.427: error : qemuConnectMonitor:1584 : Failed to connect monitor for VmUpApiTestVm
16:37:31.427: error : qemudWaitForMonitor:2515 : internal error process exited while connecting to monitor: do_spice_init: starting 0.4.2-10..el6
Comment 9 Daniel Berrange 2010-06-22 10:25:17 EDT
The log from comment #7 doesn't show any problem with libvirt dying.

All that is seen is that a QEMU guest is failing to start up due to a SPICE error:

16:37:28.421: debug : qemudStartVMDaemon:4017 : Waiting for monitor to show up
16:37:28.421: debug : qemudWaitForMonitor:2479 : Connect monitor to 0x7f01a000eb00 'VmUpApiTestVm'
16:37:31.427: error : qemuMonitorOpenUnix:277 : monitor socket did not show up.: Connection refused
16:37:31.427: debug : qemuMonitorClose:682 : mon=0x7f01a0003010
16:37:31.427: debug : qemuMonitorFree:200 : mon=0x7f01a0003010
16:37:31.427: error : qemuConnectMonitor:1584 : Failed to connect monitor for VmUpApiTestVm
16:37:31.427: error : qemudWaitForMonitor:2515 : internal error process exited while connecting to monitor: do_spice_init: starting 0.4.2-10..el6
do_spice_init: statistics shm_open failed, Permission denied
Comment 10 Daniel Berrange 2010-06-22 10:31:20 EDT
Were there any SELinux AVC errors after the shm_open() call failure? It is possible that policy additions are needed if SPICE is using SHM.
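
A quick way to check for recent AVC denials, assuming auditd is running, is something like:

  ausearch -m avc -ts recent
  # or: grep AVC /var/log/audit/audit.log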
Comment 11 Moran Goldboim 2010-06-29 08:18:49 EDT
Created attachment 427661 [details]
backtrace

The bug was reproduced with SELinux enabled; the backtrace is attached.
Comment 12 Daniel Berrange 2010-06-29 08:47:53 EDT
That was a core dump, rather than a stack trace. Unfortunately it is of no help, because I can't reliably extract data from the core dump without having exactly the same set of RPM versions of everything libvirt links to as you have. Can you capture a stack trace by running this command in GDB:

 'thread apply all backtrace'
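
If it is easier, the output can be captured to a file from within GDB, roughly like this (the file name is arbitrary):

  gdb -p $(pidof libvirtd)
  (gdb) set logging file libvirtd-backtrace.txt
  (gdb) set logging on
  (gdb) thread apply all backtrace
  (gdb) set logging off

Then attach the resulting file here.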
Comment 13 Moran Goldboim 2010-07-11 07:50:35 EDT
Created attachment 430987 [details]
backtrace full

Added a full backtrace of all threads.
Comment 14 Daniel Berrange 2010-07-12 09:32:27 EDT
Posted patch upstream

https://www.redhat.com/archives/libvir-list/2010-July/msg00214.html
Comment 16 Dave Allan 2010-07-14 02:16:06 EDT
libvirt-0.8.1-14.el6 has been built in RHEL-6-candidate with the fix.

Dave
Comment 18 Moran Goldboim 2010-08-22 06:06:23 EDT
Verified on: libvirt-0.8.1-24.el6.x86_64
Comment 19 releng-rhel@redhat.com 2010-11-11 09:47:48 EST
Red Hat Enterprise Linux 6.0 is now available and should resolve
the problem described in this bug report. This report is therefore being closed
with a resolution of CURRENTRELEASE. You may reopen this bug report if the
solution does not work for you.
