Bug 747912

Summary: 'libvirtd --daemon' - Program terminated with signal 11, Segmentation fault.
Product: Red Hat Enterprise Linux 5
Reporter: Stanislav Graf <sgraf>
Component: libvirt
Assignee: Laine Stump <laine>
Status: CLOSED WONTFIX
QA Contact: Virtualization Bugs <virt-bugs>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 5.7
CC: dahorak, dallan, mzhan, rwu, ydu, yupzhang
Target Milestone: rc
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2011-10-28 01:25:11 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions: 
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Attachments: detailed threaded backtrace (flags: none)

Description Stanislav Graf 2011-10-21 11:21:05 UTC
Created attachment 529485 [details]
detailed threaded backtrace

Description of problem:

I destroy a VM, then install a new one (with create); after the new machine shuts off and disappears, libvirtd stops.

virsh --connect qemu+ssh://remote-machine destroy "${installed_pc}"
virsh --connect qemu+ssh://remote-machine create "~/${installed_pc}-install.xml"
virsh --connect qemu+ssh://remote-machine domstate "${installed_pc}"-install
virsh --connect qemu+ssh://remote-machine start "${installed_pc}"

Version-Release number of selected component (if applicable):
libvirt-0.8.2-22.el5

How reproducible:
20% (depends on how many machines are created)

Steps to Reproduce:
1. virsh --connect qemu+ssh://remote-machine destroy "${installed_pc}"
2. virsh --connect qemu+ssh://remote-machine create "~/${installed_pc}-install.xml"
3. virsh --connect qemu+ssh://remote-machine domstate "${installed_pc}"-install
4. virsh --connect qemu+ssh://remote-machine start "${installed_pc}"
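Since the crash reproduces only ~20% of the time, the four-step sequence above can be wrapped in a loop that also checks whether the remote libvirtd is still answering. This is a sketch only; the URI, domain name, and XML path are placeholders, not taken from the report:

```shell
#!/usr/bin/env bash
# Loop the reported destroy/create/domstate/start sequence until the remote
# libvirtd stops responding. URI/DOM defaults below are assumptions.
set -u

URI="${URI:-qemu+ssh://remote-machine}"
DOM="${DOM:-${installed_pc:-guest1}}"

repro_once() {
    # One iteration of the reported sequence. 'destroy' and 'start' may fail
    # legitimately (guest not running / already active), so don't abort on them.
    virsh --connect "$URI" destroy "$DOM" || true
    virsh --connect "$URI" create "$HOME/${DOM}-install.xml"
    virsh --connect "$URI" domstate "${DOM}-install"
    virsh --connect "$URI" start "$DOM" || true
}

check_daemon() {
    # Crash detection: a healthy libvirtd answers this trivial query.
    virsh --connect "$URI" version >/dev/null 2>&1
}
```

A driver loop would then call `repro_once` and stop as soon as `check_daemon` fails, leaving the core file behind for inspection.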
  
Actual results:
libvirtd coredumped

Expected results:
libvirtd continues serving

Additional info:

Comment 1 Laine Stump 2011-10-27 17:07:02 UTC
0) are steps 1-4 run in a script, or by hand? Where exactly in this sequence is the crash occurring? From the description it sounds like it's crashing after the guest is destroyed, but from the backtrace it looks like libvirtd is crashing during the "virsh domstate".

1) Normally, virsh create will bring up a running guest, so "virsh start" should return "error: Domain is already active". Can you explain what you were trying to do here?


2) Is anything else running that might be calling libvirt on remote-machine? In particular, is virt-manager running?

3) When you say "depends on how many machines are created", are you referring to how many guests are active at the moment? Or are you just saying that when you run create-destroy-create-destroy sequentially (never more than a single guest active at a time), this crash will eventually occur?

4) For completeness' sake can you include the guest XML?

5) Does the same thing happen if you run virsh locally on "remote-machine" rather than doing it remotely?

At first glance this seems to be a problem of incorrect refcounting of the domain object, which has had problems wrt transient domains. It will take some digging to find patches that corrected such problems and determine whether or not those patches are applicable to libvirt-0.8.2, but being able to easily reproduce will help narrow things down.
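For narrowing down the refcounting problem, a full per-thread backtrace from each crash is the most useful artifact. A sketch of a helper for extracting one from a core file in batch mode (the binary and core paths are assumptions, not from the report):

```shell
# Hypothetical helper: dump a full per-thread backtrace from a libvirtd core
# file non-interactively, so it can run unattended from a reproduction script.
collect_backtrace() {
    local binary="$1" core="$2"
    gdb "$binary" "$core" --batch \
        -ex 'set pagination off' \
        -ex 'thread apply all bt full'
}

# Usage, after raising 'ulimit -c unlimited' and reproducing the crash:
#   collect_backtrace /usr/sbin/libvirtd /core.12345
```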

Comment 2 Laine Stump 2011-10-27 21:20:26 UTC
I see many similarities between this and Bug 670848 (which was reported against RHEL6 / libvirt-0.8.7). It appears that all three of the patches in that bug are also relevant to RHEL5 / libvirt-0.8.2, and there may be others (see Comment #18 of Bug 670848). This is all very tricky/delicate code though, so we need to be very careful about what we take in, to avoid unexpected regressions.

Comment 4 RHEL Program Management 2011-10-28 01:25:11 UTC
Development Management has reviewed and declined this request.  You may appeal
this decision by reopening this request.

Comment 5 Stanislav Graf 2011-11-24 10:10:30 UTC
We partially worked around this bug with a small watchdog script:

# cat bin/watchdog_libvirtd.sh

#!/usr/bin/env bash

# Restart libvirtd whenever the service stops responding
# (crude workaround for the crash above).
while true; do
	service libvirtd status &> /dev/null || service libvirtd restart &> /dev/null
	sleep 5
done
#eof