Bug 864336 - [LXC] destroy domain will hang after restart libvirtd
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: libvirt
Version: 6.4
Hardware: x86_64 Linux
Priority: high
Severity: medium
Target Milestone: rc
Target Release: ---
Assigned To: Daniel Berrange
QA Contact: Virtualization Bugs
Keywords: Regression
Duplicates: 863931
Depends On:
Blocks:
Reported: 2012-10-09 03:50 EDT by Wayne Sun
Modified: 2013-02-21 02:25 EST

See Also:
Fixed In Version: libvirt-0.10.2-3.el6
Doc Type: Bug Fix
Last Closed: 2013-02-21 02:25:55 EST
Type: Bug


Attachments
backtrace after destroy hang (8.23 KB, text/plain)
2012-10-09 06:22 EDT, Wayne Sun

Description Wayne Sun 2012-10-09 03:50:37 EDT
Description of problem:
Start an LXC domain, then restart libvirtd. A subsequent attempt to destroy the domain hangs.

Version-Release number of selected component (if applicable):
libvirt-0.10.2-2.el6.x86_64
kernel-2.6.32-306.el6.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Prepare the domain XML
# virsh -c lxc:/// dumpxml vm1
<domain type='lxc'>
  <name>vm1</name>
  <uuid>386f5b25-43ee-9d62-4ce2-58c3809e47c1</uuid>
  <memory unit='KiB'>500000</memory>
  <currentMemory unit='KiB'>500000</currentMemory>
  <vcpu placement='static'>1</vcpu>
  <os>
    <type arch='x86_64'>exe</type>
    <init>/bin/sh</init>
  </os>
  <clock offset='utc'/>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>destroy</on_crash>
  <devices>
    <emulator>/usr/libexec/libvirt_lxc</emulator>
    <interface type='network'>
      <mac address='52:54:00:f2:2c:ac'/>
      <source network='default'/>
      <target dev='veth0'/>
    </interface>
    <console type='pty'>
      <target type='lxc' port='0'/>
    </console>
  </devices>
</domain>

2. Start the LXC guest
# virsh -c lxc:/// start vm1
Domain vm1 started

# virsh -c lxc:/// list --all
 Id    Name                           State
----------------------------------------------------
 27184 vm1                            running
 -     fedora-rawhide                 shut off

3. Restart libvirtd
# service libvirtd restart
Stopping libvirtd daemon:                                  [  OK  ]
Starting libvirtd daemon:                                  [  OK  ]

4. Destroy the LXC guest
# virsh -c lxc:/// list --all
 Id    Name                           State
----------------------------------------------------
 27184 vm1                            running
 -     fedora-rawhide                 shut off

# virsh -c lxc:/// destroy vm1


The destroy command hangs here.


This happens with both application and OS domains.

Actual results:
Destroying the domain hangs after libvirtd has been restarted.

Expected results:
The domain is destroyed without hanging.

Additional info:
This works on libvirt-0.9.10-21.el6.x86_64.rpm
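
The reproduction steps above can be scripted so that a hang is reported instead of blocking the terminal. The following is a minimal sketch and not part of the original report: it assumes the coreutils "timeout" utility is available, and the function name check_destroy, the VIRSH override variable, and the 30-second default are hypothetical choices.

```shell
#!/bin/sh
# check_destroy URI DOMAIN [SECONDS]
# Runs "virsh destroy" under a time limit and reports whether it hung.
# VIRSH may name an alternative binary (hypothetical override, handy for testing).
check_destroy() {
    uri=$1
    dom=$2
    limit=${3:-30}
    rc=0
    # GNU timeout exits with status 124 when it had to kill the command.
    timeout "$limit" "${VIRSH:-virsh}" -c "$uri" destroy "$dom" || rc=$?
    if [ "$rc" -eq 0 ]; then
        echo "OK: $dom destroyed"
    elif [ "$rc" -eq 124 ]; then
        echo "HANG: destroy of $dom did not return within ${limit}s"
    else
        echo "FAIL: virsh exited with status $rc"
    fi
    return "$rc"
}

# Example, run as root after "service libvirtd restart":
#   check_destroy lxc:/// vm1 30
```

A status of 124 corresponds to the hang described in this report; any other non-zero status is an ordinary virsh failure.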
Comment 3 Daniel Berrange 2012-10-09 05:49:22 EDT
Most likely libvirtd itself has hung here. Can you obtain a stack trace of libvirtd using the gdb command "thread apply all backtrace"?
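
One way to collect such a trace without an interactive gdb session is sketched below. It assumes gdb and the libvirt debuginfo packages are installed and that the command is run as root; the function name dump_threads and the default output path are hypothetical.

```shell
#!/bin/sh
# dump_threads PID [OUTFILE]
# Writes a backtrace of every thread in the process to OUTFILE and prints
# the name of that file.
dump_threads() {
    pid=$1
    out=${2:-/tmp/backtrace-$pid.txt}
    # -batch makes gdb exit after running the -ex commands. Pagination is
    # switched off explicitly so that no "---Type <return> to continue---"
    # prompts end up in the saved trace.
    gdb -p "$pid" -batch \
        -ex "set pagination off" \
        -ex "thread apply all backtrace" > "$out" 2>&1
    echo "$out"
}

# Example:
#   dump_threads "$(pidof libvirtd)"
```

The resulting file can be attached to the bug as is.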
Comment 4 Daniel Berrange 2012-10-09 05:49:44 EDT
It is possible that this will be fixed by the following patch:

commit dd0371764f90f31fa8e596b40c0269cdbd5082f6
Author: Daniel P. Berrange <berrange@redhat.com>
Date:   Mon Sep 24 15:13:10 2012 +0100

    Remove pointless virLXCProcessMonitorDestroy method
    
    Asynchronously setting priv->mon to NULL was pointless,
    just remove the destroy callback entirely.
    
    Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Comment 5 Wayne Sun 2012-10-09 06:22:24 EDT
Created attachment 624016
backtrace after destroy hang

The backtrace is attached.
libvirtd itself might not be hung, since "virsh list" still works in another console.
Comment 6 Wayne Sun 2012-10-09 06:38:37 EDT
(In reply to comment #5)
> Created attachment 624016 [details]
> backtrace after destroy hang
> 
> The backtrace is attached.
> libvirtd might not hang, since # virsh list still working in other console.

Listing QEMU domains in another console succeeds:
# virsh list
 Id    Name                           State
----------------------------------------------------


Listing LXC domains in another console hangs:
[root@amd-1216-8-2 ~]# virsh -c lxc:/// list
Comment 7 Daniel Berrange 2012-10-09 06:42:21 EDT
Ok, the stack trace confirms that the problem is the same as the one I mentioned in that upstream patch.


Thread 8 (Thread 0x7f9df78cb700 (LWP 31177)):
#0  0x0000003b8fa0e054 in __lll_lock_wait () from /lib64/libpthread.so.0
#1  0x0000003b8fa09388 in _L_lock_854 () from /lib64/libpthread.so.0
#2  0x0000003b8fa09257 in pthread_mutex_lock () from /lib64/libpthread.so.0
#3  0x00000000004c870c in virLXCProcessMonitorDestroy (mon=0x7f9dec008860, vm=0x7f9dec0c06c0) at lxc/lxc_process.c:564
#4  0x00000000004cfcb3 in virLXCMonitorFree (mon=0x7f9dec008860) at lxc/lxc_monitor.c:173
#5  0x00000000004cfd11 in virLXCMonitorUnref (mon=0x7f9dec008860) at lxc/lxc_monitor.c:192
#6  0x00000000004c8d42 in virLXCProcessCleanup (driver=0x7f9dec0c07c0, vm=0x7f9dec0c06c0, reason=VIR_DOMAIN_SHUTOFF_DESTROYED)
    at lxc/lxc_process.c:242
#7  virLXCProcessStop (driver=0x7f9dec0c07c0, vm=0x7f9dec0c06c0, reason=VIR_DOMAIN_SHUTOFF_DESTROYED) at lxc/lxc_process.c:736
#8  0x00000000004c6e94 in lxcDomainDestroyFlags (dom=0x7f9dec0c9a00, flags=<value optimized out>) at lxc/lxc_driver.c:1327
#9  0x00007f9dff66e7d0 in virDomainDestroy (domain=0x7f9dec0c9a00) at libvirt.c:2190
#10 0x000000000043f532 in remoteDispatchDomainDestroy (server=<value optimized out>, client=<value optimized out>, 
    msg=<value optimized out>, rerr=0x7f9df78cab80, args=<value optimized out>, ret=<value optimized out>) at remote_dispatch.h:1277
#11 remoteDispatchDomainDestroyHelper (server=<value optimized out>, client=<value optimized out>, msg=<value optimized out>, 
    rerr=0x7f9df78cab80, args=<value optimized out>, ret=<value optimized out>) at remote_dispatch.h:1255
#12 0x00007f9dff6b9cc2 in virNetServerProgramDispatchCall (prog=0x24c0010, server=0x24b4e40, client=0x24bd3b0, msg=0x24bb000)
    at rpc/virnetserverprogram.c:431
#13 virNetServerProgramDispatch (prog=0x24c0010, server=0x24b4e40, client=0x24bd3b0, msg=0x24bb000) at rpc/virnetserverprogram.c:304
#14 0x00007f9dff6b84fe in virNetServerProcessMsg (srv=<value optimized out>, client=0x24bd3b0, prog=<value optimized out>, 
    msg=0x24bb000) at rpc/virnetserver.c:170
#15 0x00007f9dff6b8b9c in virNetServerHandleJob (jobOpaque=<value optimized out>, opaque=0x24b4e40) at rpc/virnetserver.c:191
#16 0x00007f9dff5ddffc in virThreadPoolWorker (opaque=<value optimized out>) at util/threadpool.c:144
#17 0x00007f9dff5dd8e9 in virThreadHelper (data=<value optimized out>) at util/threads-pthread.c:161
#18 0x0000003b8fa07851 in start_thread () from /lib64/libpthread.so.0
#19 0x0000003b8f6e767d in clone () from /lib64/libc.so.6
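
With a saved backtrace, the threads that are waiting on a mutex can be picked out mechanically. The sketch below assumes the gdb output format shown above (a "Thread N" header followed by numbered frames); the function name blocked_threads is hypothetical. For each thread that is inside pthread_mutex_lock, it prints the thread number and the function that is trying to take the lock.

```shell
#!/bin/sh
# blocked_threads FILE
# Prints "THREAD FUNCTION" for every thread in a gdb backtrace that is
# blocked in pthread_mutex_lock, where FUNCTION is the calling frame.
blocked_threads() {
    awk '
        # A "Thread N (...)" header starts a new thread section.
        /^Thread [0-9]+ / { tid = $2; waiting = 0; next }
        # Remember that this thread is inside pthread_mutex_lock.
        /^#[0-9]+ .* pthread_mutex_lock \(\)/ { waiting = 1; next }
        # The next frame is the caller that wants the lock.
        waiting && /^#[0-9]+ / {
            name = ($3 == "in") ? $4 : $2
            print tid, name
            waiting = 0
        }
    ' "$1"
}

# Example:
#   blocked_threads /tmp/backtrace.txt
```

Run against the trace above, this reports thread 8 waiting in virLXCProcessMonitorDestroy, the callback that commit dd0371764f90f31fa8e596b40c0269cdbd5082f6 removes.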
Comment 9 Daniel Berrange 2012-10-10 10:38:43 EDT
*** Bug 863931 has been marked as a duplicate of this bug. ***
Comment 13 Alex Jia 2012-10-15 22:04:40 EDT
Sometimes a running LXC guest is stopped when the libvirtd service is restarted, which blocks verification of this bug. It is not always reproducible, and it should be tracked as a new bug.

# virsh -c lxc:/// list --all
 Id    Name                           State
----------------------------------------------------
 12764 toy                            running
 -     hello                          shut off

# service libvirtd restart
Stopping libvirtd daemon:                                  [  OK  ]
Starting libvirtd daemon:                                  [  OK  ]

# virsh -c lxc:/// list --all
 Id    Name                           State
----------------------------------------------------
 -     hello                          shut off
 -     toy                            shut off


Note: when the guest is still running after the restart, I can destroy the LXC guest without a hang.

# rpm -q libvirt
libvirt-0.10.2-3.el6.x86_64
Comment 14 Alex Jia 2012-10-16 02:20:48 EDT
When the guest is still running after the restart, I can destroy the LXC guest without a hang on
libvirt-0.10.2-3.el6.x86_64, so I am moving the bug to VERIFIED status.

# virsh -c lxc:// list
 Id    Name                           State
----------------------------------------------------
 15098 toy                            running

# virsh -c lxc:// destroy toy
Domain toy destroyed

# virsh -c lxc:// list --all
 Id    Name                           State
----------------------------------------------------
 -     hello                          shut off
 -     instance-0000004a              shut off
 -     toy                            shut off


# virsh -c lxc:// dumpxml toy
<domain type='lxc'>
  <name>toy</name>
  <uuid>bb428983-cb9f-4702-0f8d-7d4e143d9aad</uuid>
  <memory unit='KiB'>500000</memory>
  <currentMemory unit='KiB'>500000</currentMemory>
  <vcpu placement='static'>4</vcpu>
  <os>
    <type arch='x86_64'>exe</type>
    <init>/bin/sh</init>
  </os>
  <clock offset='utc'/>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>destroy</on_crash>
  <devices>
    <emulator>/usr/libexec/libvirt_lxc</emulator>
    <console type='pty'>
      <target type='lxc' port='0'/>
    </console>
  </devices>
</domain>
Comment 15 errata-xmlrpc 2013-02-21 02:25:55 EST
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2013-0276.html
