Bug 864336

Summary: [LXC] destroy domain will hang after restart libvirtd
Product: Red Hat Enterprise Linux 6
Reporter: Wayne Sun <gsun>
Component: libvirt
Assignee: Daniel Berrangé <berrange>
Status: CLOSED ERRATA
QA Contact: Virtualization Bugs <virt-bugs>
Severity: medium
Priority: high
Version: 6.4
CC: acathrow, ajia, dallan, dyasny, dyuan, eblake, mzhan, rwu, zhwang
Target Milestone: rc
Keywords: Regression
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: libvirt-0.10.2-3.el6
Doc Type: Bug Fix
Last Closed: 2013-02-21 07:25:55 UTC
Type: Bug
Attachments: backtrace after destroy hang (flags: none)

Description Wayne Sun 2012-10-09 07:50:37 UTC
Description of problem:
Start an LXC domain, then restart libvirtd. Destroying the domain afterwards hangs.

Version-Release number of selected component (if applicable):
libvirt-0.10.2-2.el6.x86_64
kernel-2.6.32-306.el6.x86_64

How reproducible:
100%

Steps to Reproduce:
1. prepare the domain xml
# virsh -c lxc:/// dumpxml vm1
<domain type='lxc'>
  <name>vm1</name>
  <uuid>386f5b25-43ee-9d62-4ce2-58c3809e47c1</uuid>
  <memory unit='KiB'>500000</memory>
  <currentMemory unit='KiB'>500000</currentMemory>
  <vcpu placement='static'>1</vcpu>
  <os>
    <type arch='x86_64'>exe</type>
    <init>/bin/sh</init>
  </os>
  <clock offset='utc'/>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>destroy</on_crash>
  <devices>
    <emulator>/usr/libexec/libvirt_lxc</emulator>
    <interface type='network'>
      <mac address='52:54:00:f2:2c:ac'/>
      <source network='default'/>
      <target dev='veth0'/>
    </interface>
    <console type='pty'>
      <target type='lxc' port='0'/>
    </console>
  </devices>
</domain>

2. start lxc guest
# virsh -c lxc:/// start vm1
Domain vm1 started

# virsh -c lxc:/// list --all
 Id    Name                           State
----------------------------------------------------
 27184 vm1                            running
 -     fedora-rawhide                 shut off

3. restart libvirtd
# service libvirtd restart
Stopping libvirtd daemon:                                  [  OK  ]
Starting libvirtd daemon:                                  [  OK  ]

4. destroy lxc guest
# virsh -c lxc:/// list --all
 Id    Name                           State
----------------------------------------------------
 27184 vm1                            running
 -     fedora-rawhide                 shut off

# virsh -c lxc:/// destroy vm1


The destroy command hangs here.

This happens with both application and OS domains.

Actual results:
Destroying the domain hangs after libvirtd is restarted.

Expected results:
The domain is destroyed without hanging.

Additional info:
This works on libvirt-0.9.10-21.el6.x86_64.rpm
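
The steps above can be scripted so that a hang is reported instead of blocking the terminal. This is only a sketch: the function names `run_bounded` and `reproduce_destroy_hang` are made up for illustration, the 30-second limit is an arbitrary assumption, and it relies on the coreutils `timeout` command being available.

```shell
#!/bin/sh
# Hypothetical helpers; the names and the 30s limit are illustrative only.

# run_bounded SECONDS CMD...: run CMD and report if it exceeds SECONDS.
run_bounded() {
    limit="$1"; shift
    rc=0
    timeout "$limit" "$@" || rc=$?
    # coreutils timeout exits with status 124 when the limit is reached.
    if [ "$rc" -eq 124 ]; then
        echo "HANG: '$*' did not finish within ${limit}s" >&2
    fi
    return "$rc"
}

# Start the guest, restart the daemon, then attempt a bounded destroy.
reproduce_destroy_hang() {
    dom="${1:-vm1}"
    virsh -c lxc:/// start "$dom" || return 1
    service libvirtd restart || return 1
    run_bounded 30 virsh -c lxc:/// destroy "$dom"
}
```

Calling `reproduce_destroy_hang vm1` as root would return 124 if the destroy call does not complete within the limit. Note that `timeout` only ends the virsh client; a worker thread stuck inside libvirtd would stay stuck.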

Comment 3 Daniel Berrangé 2012-10-09 09:49:22 UTC
Most likely libvirtd itself has hung here. Can you obtain a stack trace of libvirtd using the gdb command "thread apply all backtrace"?
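
One way to capture such a trace non-interactively is sketched below. It assumes gdb is installed and that the caller is allowed to attach to the daemon (typically root); the helper name `dump_daemon_backtrace` is made up for illustration. Matching debuginfo packages are needed for gdb to resolve source file and line information.

```shell
#!/bin/sh
# Hypothetical helper: print a backtrace of every thread in a running daemon.
# Assumes gdb is installed and the caller may attach to the process.
dump_daemon_backtrace() {
    name="${1:-libvirtd}"
    # pidof -s prints a single PID and fails if no such process is running.
    pid=$(pidof -s "$name") || {
        echo "no running process named $name" >&2
        return 1
    }
    # -batch makes gdb run the given command and then exit, detaching
    # from the process so the daemon keeps running.
    gdb -p "$pid" -batch -ex "thread apply all backtrace"
}
```

For example, `dump_daemon_backtrace libvirtd > libvirtd-backtrace.txt` would save the trace to a file that can be attached to the bug.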

Comment 4 Daniel Berrangé 2012-10-09 09:49:44 UTC
It is possible that this will be fixed by the following patch:

commit dd0371764f90f31fa8e596b40c0269cdbd5082f6
Author: Daniel P. Berrange <berrange>
Date:   Mon Sep 24 15:13:10 2012 +0100

    Remove pointless virLXCProcessMonitorDestroy method
    
    Asynchronously setting priv->mon to NULL was pointless,
    just remove the destroy callback entirely.
    
    Signed-off-by: Daniel P. Berrange <berrange>

Comment 5 Wayne Sun 2012-10-09 10:22:24 UTC
Created attachment 624016
backtrace after destroy hang

The backtrace is attached.
libvirtd itself might not be hung, since "virsh list" still works in another console.

Comment 6 Wayne Sun 2012-10-09 10:38:37 UTC
(In reply to comment #5)
> Created attachment 624016 [details]
> backtrace after destroy hang
> 
> The backtrace is attached.
> libvirtd might not hang, since # virsh list still working in other console.

Listing QEMU domains from another console succeeds:
# virsh list
 Id    Name                           State
----------------------------------------------------


Listing LXC domains from another console hangs:
[root@amd-1216-8-2 ~]# virsh -c lxc:/// list

Comment 7 Daniel Berrangé 2012-10-09 10:42:21 UTC
Ok, the stack trace confirms that the problem is the same as the one I mentioned in that upstream patch.


Thread 8 (Thread 0x7f9df78cb700 (LWP 31177)):
#0  0x0000003b8fa0e054 in __lll_lock_wait () from /lib64/libpthread.so.0
#1  0x0000003b8fa09388 in _L_lock_854 () from /lib64/libpthread.so.0
#2  0x0000003b8fa09257 in pthread_mutex_lock () from /lib64/libpthread.so.0
#3  0x00000000004c870c in virLXCProcessMonitorDestroy (mon=0x7f9dec008860, vm=0x7f9dec0c06c0) at lxc/lxc_process.c:564
#4  0x00000000004cfcb3 in virLXCMonitorFree (mon=0x7f9dec008860) at lxc/lxc_monitor.c:173
#5  0x00000000004cfd11 in virLXCMonitorUnref (mon=0x7f9dec008860) at lxc/lxc_monitor.c:192
#6  0x00000000004c8d42 in virLXCProcessCleanup (driver=0x7f9dec0c07c0, vm=0x7f9dec0c06c0, reason=VIR_DOMAIN_SHUTOFF_DESTROYED)
    at lxc/lxc_process.c:242
#7  virLXCProcessStop (driver=0x7f9dec0c07c0, vm=0x7f9dec0c06c0, reason=VIR_DOMAIN_SHUTOFF_DESTROYED) at lxc/lxc_process.c:736
#8  0x00000000004c6e94 in lxcDomainDestroyFlags (dom=0x7f9dec0c9a00, flags=<value optimized out>) at lxc/lxc_driver.c:1327
#9  0x00007f9dff66e7d0 in virDomainDestroy (domain=0x7f9dec0c9a00) at libvirt.c:2190
#10 0x000000000043f532 in remoteDispatchDomainDestroy (server=<value optimized out>, client=<value optimized out>, 
    msg=<value optimized out>, rerr=0x7f9df78cab80, args=<value optimized out>, ret=<value optimized out>) at remote_dispatch.h:1277
#11 remoteDispatchDomainDestroyHelper (server=<value optimized out>, client=<value optimized out>, msg=<value optimized out>, 
    rerr=0x7f9df78cab80, args=<value optimized out>, ret=<value optimized out>) at remote_dispatch.h:1255
#12 0x00007f9dff6b9cc2 in virNetServerProgramDispatchCall (prog=0x24c0010, server=0x24b4e40, client=0x24bd3b0, msg=0x24bb000)
    at rpc/virnetserverprogram.c:431
#13 virNetServerProgramDispatch (prog=0x24c0010, server=0x24b4e40, client=0x24bd3b0, msg=0x24bb000) at rpc/virnetserverprogram.c:304
#14 0x00007f9dff6b84fe in virNetServerProcessMsg (srv=<value optimized out>, client=0x24bd3b0, prog=<value optimized out>, 
    msg=0x24bb000) at rpc/virnetserver.c:170
#15 0x00007f9dff6b8b9c in virNetServerHandleJob (jobOpaque=<value optimized out>, opaque=0x24b4e40) at rpc/virnetserver.c:191
#16 0x00007f9dff5ddffc in virThreadPoolWorker (opaque=<value optimized out>) at util/threadpool.c:144
#17 0x00007f9dff5dd8e9 in virThreadHelper (data=<value optimized out>) at util/threads-pthread.c:161
#18 0x0000003b8fa07851 in start_thread () from /lib64/libpthread.so.0
#19 0x0000003b8f6e767d in clone () from /lib64/libc.so.6
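
When a saved "thread apply all backtrace" dump is long, a filter can help locate threads like the one above. The sketch below prints the header line of every thread that has a frame in `__lll_lock_wait` or `pthread_mutex_lock`. The function name `blocked_threads` is made up for illustration, and a thread waiting on a mutex is only a candidate: it is not necessarily deadlocked.

```shell
#!/bin/sh
# Hypothetical triage helper: list threads in a gdb backtrace dump that
# are waiting to acquire a pthread mutex. Reads the named files, or
# standard input when no file is given.
blocked_threads() {
    awk '
        # Remember the most recent "Thread N (...)" header line.
        /^Thread [0-9]+ / { thread = $0 }
        # Print that header once if a frame in the thread is a mutex wait.
        /pthread_mutex_lock|__lll_lock_wait/ && thread != "" {
            print thread
            thread = ""
        }
    ' "$@"
}
```

For example, `blocked_threads libvirtd-backtrace.txt` would list the candidate threads in a saved dump.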

Comment 9 Daniel Berrangé 2012-10-10 14:38:43 UTC
*** Bug 863931 has been marked as a duplicate of this bug. ***

Comment 13 Alex Jia 2012-10-16 02:04:40 UTC
Sometimes a running LXC guest is stopped when the libvirtd service is restarted, which blocks verification of this bug. It is not always reproducible, and it should be tracked as a new bug.

# virsh -c lxc:/// list --all
 Id    Name                           State
----------------------------------------------------
 12764 toy                            running
 -     hello                          shut off

# service libvirtd restart
Stopping libvirtd daemon:                                  [  OK  ]
Starting libvirtd daemon:                                  [  OK  ]

# virsh -c lxc:/// list --all
 Id    Name                           State
----------------------------------------------------
 -     hello                          shut off
 -     toy                            shut off


Note: when that problem does not occur, I can destroy the LXC guest without a hang.

# rpm -q libvirt
libvirt-0.10.2-3.el6.x86_64

Comment 14 Alex Jia 2012-10-16 06:20:48 UTC
When the guest is still running after the restart, I can destroy the LXC guest without a hang on
libvirt-0.10.2-3.el6.x86_64, so I am moving the bug to VERIFIED status.

# virsh -c lxc:// list
 Id    Name                           State
----------------------------------------------------
 15098 toy                            running

# virsh -c lxc:// destroy toy
Domain toy destroyed

# virsh -c lxc:// list --all
 Id    Name                           State
----------------------------------------------------
 -     hello                          shut off
 -     instance-0000004a              shut off
 -     toy                            shut off


# virsh -c lxc:// dumpxml toy
<domain type='lxc'>
  <name>toy</name>
  <uuid>bb428983-cb9f-4702-0f8d-7d4e143d9aad</uuid>
  <memory unit='KiB'>500000</memory>
  <currentMemory unit='KiB'>500000</currentMemory>
  <vcpu placement='static'>4</vcpu>
  <os>
    <type arch='x86_64'>exe</type>
    <init>/bin/sh</init>
  </os>
  <clock offset='utc'/>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>destroy</on_crash>
  <devices>
    <emulator>/usr/libexec/libvirt_lxc</emulator>
    <console type='pty'>
      <target type='lxc' port='0'/>
    </console>
  </devices>
</domain>

Comment 15 errata-xmlrpc 2013-02-21 07:25:55 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2013-0276.html