Bug 1136788

Summary: libvirtd segfault while starting a guest with <on_lockfailure> as ignore
Product: Red Hat Enterprise Linux 6
Component: libvirt
Version: 6.6
Hardware: Unspecified
OS: Unspecified
Status: CLOSED ERRATA
Severity: high
Priority: high
Reporter: Xuesong Zhang <xuzhang>
Assignee: Jiri Denemark <jdenemar>
QA Contact: Virtualization Bugs <virt-bugs>
CC: bsanford, danken, dyuan, lsu, mprivozn, myakove, mzhan, pzhang, rbalakri, shyu, tpelka, yanyang
Target Milestone: rc
Keywords: Upstream
Fixed In Version: libvirt-0.10.2-47.el6
Doc Type: Bug Fix
Type: Bug
Last Closed: 2015-07-22 05:47:01 UTC

Description Xuesong Zhang 2014-09-03 10:06:46 UTC
Description of problem:
Edit a guest and set <on_lockfailure> to ignore, then start the guest; libvirtd segfaults but does not die.

Version-Release number of selected component (if applicable):
libvirt-0.10.2-45.el6.x86_64
qemu-kvm-rhev-0.12.1.2-2.443.el6.x86_64
kernel-2.6.32-500.el6.x86_64
glibc-2.12-1.149.el6.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Configure the sanlock environment:
#tail -1 /etc/libvirt/qemu.conf
lock_manager = "sanlock"

#tail -6 /etc/libvirt/qemu-sanlock.conf
auto_disk_leases = 1
disk_lease_dir = "/var/lib/libvirt/sanlock"
host_id = 1
require_lease_for_disks = 1
user = "sanlock"
group = "sanlock"

#setsebool virt_use_sanlock 1

2. Start the services (a quick sanity check follows):
#service wdmd start
#service sanlock start
#service libvirtd restart
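
As an optional sanity check (not part of the original report), confirm the daemons are up and that, with auto_disk_leases = 1, the configured disk_lease_dir was populated with a lockspace:

#sanlock client status
#ls /var/lib/libvirt/sanlock/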

3. Edit the guest XML and add the following element (a placement sketch follows):
......
<on_lockfailure>ignore</on_lockfailure>
......
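
For context (not part of the original report): <on_lockfailure> is a top-level element of the domain XML, a sibling of the other lifecycle-action elements. A minimal placement sketch:

<domain type='kvm'>
  <name>test</name>
  ......
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>destroy</on_crash>
  <on_lockfailure>ignore</on_lockfailure>
  ......
</domain>

The documented values are poweroff, restart, pause, and ignore; as the verified error message in comment 11 shows, sanlock does not support ignore.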

4. Start the guest:
virsh start test
error: Failed to start domain test
error: Child quit during startup handshake: Input/output error


Actual results:
libvirtd segfaults but the daemon does not die: per the backtrace below, the crash occurs in the forked child that is preparing the new guest (hence the "Child quit during startup handshake" error).

Expected results:
libvirtd should be working well, no segfault.


Additional info:
1. When the <on_lockfailure> value is poweroff, restart, or pause, the guest starts successfully.
2. bt info of the segfault:
(gdb) t a a bt

Thread 1 (Thread 0x7f550c01a700 (LWP 12424)):
#0  __libc_free (mem=0x4) at malloc.c:3714
#1  0x00007f5513324dd9 in virFree (ptrptr=0x7f550c018850) at util/memory.c:419
#2  0x00007f5505206687 in virLockManagerSanlockAcquire (lock=<value optimized out>, state=<value optimized out>, flags=<value optimized out>, 
    action=<value optimized out>, fd=0x7f550c01897c) at locking/lock_driver_sanlock.c:1051
#3  0x00007f55133da72b in virDomainLockProcessStart (plugin=<value optimized out>, uri=<value optimized out>, dom=0x7f5500107970, paused=true, 
    fd=0x7f550c01897c) at locking/domain_lock.c:178
#4  0x00000000004b8c82 in qemuProcessHook (data=0x7f550c019370) at qemu/qemu_process.c:2851
#5  0x00007f5513313bc3 in virCommandHook (data=0x7f54d8002c20) at util/command.c:2088
#6  0x00007f5513315c8d in virExecWithHook (argv=0x7f54d8003100, envp=0x7f54d8002580, keepfd=<value optimized out>, 
    keepfd_size=<value optimized out>, retpid=<value optimized out>, infd=30, outfd=0x7f550c0193bc, errfd=0x7f550c0193bc, flags=7, 
    data=0x7f54d8002c20, pidfile=0x7f54d800b080 "/var/run/libvirt/qemu/test.pid", capabilities=0, maxMemLock=0, maxProcesses=0, maxFiles=0, 
    hook=0x7f5513313b60 <virCommandHook>) at util/command.c:630
#7  0x00007f551331652f in virCommandRunAsync (cmd=0x7f54d8002c20, pid=0x0) at util/command.c:2232
#8  0x00007f5513316969 in virCommandRun (cmd=0x7f54d8002c20, exitstatus=0x0) at util/command.c:2018
#9  0x00000000004bb22e in qemuProcessStart (conn=0x7f54e8000b60, driver=0x7f550000e8c0, vm=0x7f5500107970, migrateFrom=0x0, stdin_fd=-1, 
    stdin_path=0x0, snapshot=0x0, vmop=VIR_NETDEV_VPORT_PROFILE_OP_CREATE, flags=1) at qemu/qemu_process.c:4081
#10 0x000000000046c83e in qemuDomainObjStart (conn=0x7f54e8000b60, driver=0x7f550000e8c0, vm=0x7f5500107970, flags=<value optimized out>)
    at qemu/qemu_driver.c:6124
#11 0x000000000046ce72 in qemuDomainStartWithFlags (dom=0x7f54d80008c0, flags=0) at qemu/qemu_driver.c:6181
#12 0x00007f55133c4fc0 in virDomainCreate (domain=0x7f54d80008c0) at libvirt.c:8385
#13 0x00000000004409c2 in remoteDispatchDomainCreate (server=<value optimized out>, client=<value optimized out>, msg=<value optimized out>, 
    rerr=0x7f550c019b80, args=<value optimized out>, ret=<value optimized out>) at remote_dispatch.h:1066
#14 remoteDispatchDomainCreateHelper (server=<value optimized out>, client=<value optimized out>, msg=<value optimized out>, rerr=0x7f550c019b80, 
    args=<value optimized out>, ret=<value optimized out>) at remote_dispatch.h:1044
#15 0x00007f551340ecf2 in virNetServerProgramDispatchCall (prog=0x1692480, server=0x1689a30, client=0x1686210, msg=0x168eed0)
    at rpc/virnetserverprogram.c:431
#16 virNetServerProgramDispatch (prog=0x1692480, server=0x1689a30, client=0x1686210, msg=0x168eed0) at rpc/virnetserverprogram.c:304
#17 0x00007f551341124e in virNetServerProcessMsg (srv=<value optimized out>, client=0x1686210, prog=<value optimized out>, msg=0x168eed0)
    at rpc/virnetserver.c:170
#18 0x00007f55134118ec in virNetServerHandleJob (jobOpaque=<value optimized out>, opaque=0x1689a30) at rpc/virnetserver.c:191
#19 0x00007f551332fb3c in virThreadPoolWorker (opaque=<value optimized out>) at util/threadpool.c:144
#20 0x00007f551332f429 in virThreadHelper (data=<value optimized out>) at util/threads-pthread.c:161
#21 0x00000032fa6079d1 in start_thread (arg=0x7f550c01a700) at pthread_create.c:301
#22 0x00000032fa2e8ccd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115

Comment 1 Jiri Denemark 2014-09-03 11:29:10 UTC
This just makes libvirt report a different (and unhelpful) error message when starting such a domain. The crash is a bug in the failure path taken after encountering the unsupported "ignore" action.

Comment 2 Jiri Denemark 2014-09-03 13:25:45 UTC
Fixed upstream by v1.2.8-7-g760cf5d:

commit 760cf5d30e44803268a7103175056a3203d52527
Author: Jiri Denemark <jdenemar>
Date:   Wed Sep 3 13:18:59 2014 +0200

    sanlock: Avoid freeing uninitialized value
    
    https://bugzilla.redhat.com/show_bug.cgi?id=1136788
    Signed-off-by: Jiri Denemark <jdenemar>
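
The commit title names the root cause: a free() of a pointer that is only assigned on the success path. A minimal C sketch of the pattern and the fix (hypothetical names, not the actual lock_driver_sanlock.c code):

#include <stdlib.h>
#include <string.h>

static int acquire_sketch(int action_supported)
{
    char *opt = NULL;  /* the fix: initialize so free(NULL) is a harmless
                        * no-op; before the fix this was "char *opt;",
                        * i.e. stack garbage */
    int ret = -1;

    if (!action_supported)
        goto cleanup;  /* error path taken for on_lockfailure=ignore */

    opt = strdup("resource options");  /* only assigned on the success path */
    /* ... register lockspace, acquire leases ... */
    ret = 0;

cleanup:
    free(opt);  /* pre-fix: freed an uninitialized pointer here, producing
                 * a bogus address like the 0x4 in frame #0 above */
    return ret;
}

int main(void)
{
    return acquire_sketch(0) == -1 ? 0 : 1;  /* drive the error path */
}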

Comment 3 Michal Privoznik 2014-10-08 12:27:53 UTC
*** Bug 1148467 has been marked as a duplicate of this bug. ***

Comment 11 Pei Zhang 2015-01-04 05:53:10 UTC
Verified versions:
libvirt-0.10.2-47.el6.x86_64
qemu-kvm-rhev-0.12.1.2-2.449.el6.x86_64
kernel-2.6.32-504.el6.x86_64

Steps:
1. Configure the sanlock environment:
#tail -1 /etc/libvirt/qemu.conf
lock_manager = "sanlock"

#tail -6 /etc/libvirt/qemu-sanlock.conf
auto_disk_leases = 1
disk_lease_dir = "/var/lib/libvirt/sanlock"
host_id = 1
require_lease_for_disks = 1
user = "sanlock"
group = "sanlock"

#setsebool virt_use_sanlock on

2. Start the services:
#service wdmd start
#service sanlock start
#service libvirtd restart

3. Add the on_lockfailure ignore element to the domain XML:

# virsh dumpxml r6-rls | grep on_lockfailure
......
  <on_lockfailure>ignore</on_lockfailure>
......

4. Try to start the guest:

# virsh start r6-rls
error: Failed to start domain r6-rls
error: internal error unsupported configuration: Failure action ignore is not supported by sanlock

5. Check the libvirtd status; libvirtd is still running:

# service libvirtd status
libvirtd (pid  27623) is running...

libvirt reports a clear error message and keeps running. Moving to VERIFIED.

Comment 13 errata-xmlrpc 2015-07-22 05:47:01 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-1252.html