Bug 1188609 - Error from libvirt during destroy Code=38 - Device or resource busy
Summary: Error from libvirt during destroy Code=38 - Device or resource busy
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-nova
Version: 6.0 (Juno)
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 7.0 (Kilo)
Assignee: Kashyap Chamarthy
QA Contact: nlevinki
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2015-02-03 11:30 UTC by bkopilov
Modified: 2019-10-10 09:36 UTC
CC List: 15 users

Fixed In Version: openstack-nova-2014.2.3-26.el7ost
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-03-24 13:53:14 UTC
Target Upstream Version:
Embargoed:


Attachments
logs for nova /var/logs/ (10.79 MB, application/x-gzip) -- 2015-02-03 11:33 UTC, bkopilov
libvirt.log files [gz file] (2.30 MB, application/x-gzip) -- 2015-03-01 07:28 UTC, bkopilov
nova-compute.log [gz file] (730.66 KB, application/x-gzip) -- 2015-03-01 07:29 UTC, bkopilov


Links
Red Hat Product Errata RHBA-2016:0507 (normal, SHIPPED_LIVE): openstack-nova bug fix advisory -- Last Updated: 2016-03-24 17:52:57 UTC

Description bkopilov 2015-02-03 11:30:22 UTC
Description of problem:
OpenStack is installed on RHEL 7.1 with RHOS 6.

I ran the Tempest tests; there is an error while trying to terminate an instance.

From the nova-compute logs:
2015-02-03 05:51:59.270 30113 DEBUG nova.openstack.common.processutils [req-ae1ecc18-ee36-4a3c-9fe9-ca5d3e296a09 None] Result was 1 execute /usr/lib/python2.7/site-packages/nova/openstack/common/processutils.py:195
2015-02-03 05:51:59.271 30113 DEBUG nova.virt.libvirt.driver [req-ae1ecc18-ee36-4a3c-9fe9-ca5d3e296a09 None] [instance: a961759a-afb1-4720-b922-d230a3155058] Could not determine fibre channel world wide node names get_volume_connector /usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py:1280
2015-02-03 05:51:59.272 30113 DEBUG nova.openstack.common.processutils [req-ae1ecc18-ee36-4a3c-9fe9-ca5d3e296a09 None] Running cmd (subprocess): sudo nova-rootwrap /etc/nova/rootwrap.conf systool -c fc_host -v execute /usr/lib/python2.7/site-packages/nova/openstack/common/processutils.py:161
2015-02-03 05:51:59.496 30113 DEBUG nova.openstack.common.processutils [req-ae1ecc18-ee36-4a3c-9fe9-ca5d3e296a09 None] Result was 1 execute /usr/lib/python2.7/site-packages/nova/openstack/common/processutils.py:195
2015-02-03 05:51:59.498 30113 DEBUG nova.virt.libvirt.driver [req-ae1ecc18-ee36-4a3c-9fe9-ca5d3e296a09 None] [instance: a961759a-afb1-4720-b922-d230a3155058] Could not determine fibre channel world wide port names get_volume_connector /usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py:1287
2015-02-03 05:51:59.498 30113 DEBUG nova.volume.cinder [req-ae1ecc18-ee36-4a3c-9fe9-ca5d3e296a09 None] Cinderclient connection created using URL: http://10.35.186.75:8776/v1/f4e7b04367eb4cbfb72b9c668ef95e0d get_cinder_client_version /usr/lib/python2.7/site-packages/nova/volume/cinder.py:255
2015-02-03 05:51:59.694 30113 DEBUG nova.volume.cinder [req-ae1ecc18-ee36-4a3c-9fe9-ca5d3e296a09 None] Cinderclient connection created using URL: http://10.35.186.75:8776/v1/f4e7b04367eb4cbfb72b9c668ef95e0d get_cinder_client_version /usr/lib/python2.7/site-packages/nova/volume/cinder.py:255
2015-02-03 05:52:01.051 30113 DEBUG nova.openstack.common.lockutils [req-b8415789-5391-4ce7-816f-e5f4c6507214 ] Created new semaphore "a961759a-afb1-4720-b922-d230a3155058" internal_lock /usr/lib/python2.7/site-packages/nova/openstack/common/lockutils.py:206
2015-02-03 05:52:01.051 30113 DEBUG nova.openstack.common.lockutils [req-b8415789-5391-4ce7-816f-e5f4c6507214 ] Acquired semaphore "a961759a-afb1-4720-b922-d230a3155058" lock /usr/lib/python2.7/site-packages/nova/openstack/common/lockutils.py:229
2015-02-03 05:52:01.052 30113 DEBUG nova.openstack.common.lockutils [req-b8415789-5391-4ce7-816f-e5f4c6507214 ] Got semaphore / lock "do_terminate_instance" inner /usr/lib/python2.7/site-packages/nova/openstack/common/lockutils.py:271
2015-02-03 05:52:01.052 30113 DEBUG nova.openstack.common.lockutils [req-b8415789-5391-4ce7-816f-e5f4c6507214 ] Created new semaphore "a961759a-afb1-4720-b922-d230a3155058-events" internal_lock /usr/lib/python2.7/site-packages/nova/openstack/common/lockutils.py:206
2015-02-03 05:52:01.052 30113 DEBUG nova.openstack.common.lockutils [req-b8415789-5391-4ce7-816f-e5f4c6507214 ] Acquired semaphore "a961759a-afb1-4720-b922-d230a3155058-events" lock /usr/lib/python2.7/site-packages/nova/openstack/common/lockutils.py:229
2015-02-03 05:52:01.053 30113 DEBUG nova.openstack.common.lockutils [req-b8415789-5391-4ce7-816f-e5f4c6507214 ] Got semaphore / lock "_clear_events" inner /usr/lib/python2.7/site-packages/nova/openstack/common/lockutils.py:271
2015-02-03 05:52:01.053 30113 DEBUG nova.openstack.common.lockutils [req-b8415789-5391-4ce7-816f-e5f4c6507214 ] Releasing semaphore "a961759a-afb1-4720-b922-d230a3155058-events" lock /usr/lib/python2.7/site-packages/nova/openstack/common/lockutils.py:238
2015-02-03 05:52:01.053 30113 DEBUG nova.openstack.common.lockutils [req-b8415789-5391-4ce7-816f-e5f4c6507214 ] Semaphore / lock released "_clear_events" inner /usr/lib/python2.7/site-packages/nova/openstack/common/lockutils.py:275
2015-02-03 05:52:01.083 30113 AUDIT nova.compute.manager [req-b8415789-5391-4ce7-816f-e5f4c6507214 None] [instance: a961759a-afb1-4720-b922-d230a3155058] Terminating instance
2015-02-03 05:52:16.214 30113 ERROR nova.virt.libvirt.driver [req-b8415789-5391-4ce7-816f-e5f4c6507214 None] [instance: a961759a-afb1-4720-b922-d230a3155058] Error from libvirt during destroy. Code=38 Error=Failed to terminate process 6329 with SIGKILL: Device or resource busy
2015-02-03 05:52:16.216 30113 DEBUG nova.compute.manager [req-b8415789-5391-4ce7-816f-e5f4c6507214 None] [instance: a961759a-afb1-4720-b922-d230a3155058] Deallocating network for instance _deallocate_network /usr/lib/python2.7/site-packages/nova/compute/manager.py:1916
2015-02-03 05:52:16.216 30113 DEBUG nova.network.neutronv2.api [req-b8415789-5391-4ce7-816f-e5f4c6507214 None] [instance: a961759a-afb1-4720-b922-d230a3155058] deallocate_for_instance() deallocate_for_instance /usr/lib/python2.7/site-packages/nova/network/neutronv2/api.py:528
2015-02-03 05:52:16.217 30113 DEBUG neutronclient.client [req-b8415789-5391-4ce7-816f-e5f4c6507214 ] 
REQ: curl -i http://10.35.186.75:9696/v2.0/ports.json?device_id=a961759a-afb1-4720-b922-d230a3155058 -X GET -H "X-Auth-Token: c56320c0c2524825bced0c35f1259f4d" -H "User-Agent: python-neutronclient"
 http_log_req /usr/lib/python2.7/site-packages/neutronclient/common/utils.py:140
2015-02-03 05:52:16.263 30113 DEBUG neutronclient.client [req-b8415789-5391-4ce7-816f-e5f4c6507214 ] RESP:200 {'date': 'Tue, 03 Feb 2015 10:52:16 GMT', 'content-length': '13', 'content-type': 'application/json; charset=UTF-8', 'x-openstack-request-id': 'req-649fc8b3-1bb0-4979-8718-7ad83b3c72c4'} {"ports": []}
 http_log_resp /usr/lib/python2.7/site-packages/neutronclient/common/utils.py:149
2015-02-03 05:52:16.264 30113 DEBUG nova.network.base_api [req-b8415789-5391-4ce7-816f-e5f4c6507214 None] Updating cache with info: [] update_instance_cache_with_nw_info /usr/lib/python2.7/site-packages/nova/network/base_api.py:40
2015-02-03 05:52:16.408 30113 ERROR nova.compute.manager [req-b8415789-5391-4ce7-816f-e5f4c6507214 None] [instance: a961759a-afb1-4720-b922-d230a3155058] Setting instance vm_state to ERROR
2015-02-03 05:52:16.408 30113 TRACE nova.compute.manager [instance: a961759a-afb1-4720-b922-d230a3155058] Traceback (most recent call last):
2015-02-03 05:52:16.408 30113 TRACE nova.compute.manager [instance: a961759a-afb1-4720-b922-d230a3155058]   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2510, in do_terminate_instance
2015-02-03 05:52:16.408 30113 TRACE nova.compute.manager [instance: a961759a-afb1-4720-b922-d230a3155058]     self._delete_instance(context, instance, bdms, quotas)
2015-02-03 05:52:16.408 30113 TRACE nova.compute.manager [instance: a961759a-afb1-4720-b922-d230a3155058]   File "/usr/lib/python2.7/site-packages/nova/hooks.py", line 131, in inner
2015-02-03 05:52:16.408 30113 TRACE nova.compute.manager [instance: a961759a-afb1-4720-b922-d230a3155058]     rv = f(*args, **kwargs)
2015-02-03 05:52:16.408 30113 TRACE nova.compute.manager [instance: a961759a-afb1-4720-b922-d230a3155058]   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2479, in _delete_instance
2015-02-03 05:52:16.408 30113 TRACE nova.compute.manager [instance: a961759a-afb1-4720-b922-d230a3155058]     quotas.rollback()
2015-02-03 05:52:16.408 30113 TRACE nova.compute.manager [instance: a961759a-afb1-4720-b922-d230a3155058]   File "/usr/lib/python2.7/site-packages/nova/openstack/common/excutils.py", line 82, in __exit__
2015-02-03 05:52:16.408 30113 TRACE nova.compute.manager [instance: a961759a-afb1-4720-b922-d230a3155058]     six.reraise(self.type_, self.value, self.tb)
2015-02-03 05:52:16.408 30113 TRACE nova.compute.manager [instance: a961759a-afb1-4720-b922-d230a3155058]   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2456, in _delete_instance
2015-02-03 05:52:16.408 30113 TRACE nova.compute.manager [instance: a961759a-afb1-4720-b922-d230a3155058]     self._shutdown_instance(context, instance, bdms)
2015-02-03 05:52:16.408 30113 TRACE nova.compute.manager [instance: a961759a-afb1-4720-b922-d230a3155058]   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2386, in _shutdown_instance
2015-02-03 05:52:16.408 30113 TRACE nova.compute.manager [instance: a961759a-afb1-4720-b922-d230a3155058]     requested_networks)
2015-02-03 05:52:16.408 30113 TRACE nova.compute.manager [instance: a961759a-afb1-4720-b922-d230a3155058]   File "/usr/lib/python2.7/site-packages/nova/openstack/common/excutils.py", line 82, in __exit__
2015-02-03 05:52:16.408 30113 TRACE nova.compute.manager [instance: a961759a-afb1-4720-b922-d230a3155058]     six.reraise(self.type_, self.value, self.tb)
2015-02-03 05:52:16.408 30113 TRACE nova.compute.manager [instance: a961759a-afb1-4720-b922-d230a3155058]   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2375, in _shutdown_instance
2015-02-03 05:52:16.408 30113 TRACE nova.compute.manager [instance: a961759a-afb1-4720-b922-d230a3155058]     block_device_info)
2015-02-03 05:52:16.408 30113 TRACE nova.compute.manager [instance: a961759a-afb1-4720-b922-d230a3155058]   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 1055, in destroy
2015-02-03 05:52:16.408 30113 TRACE nova.compute.manager [instance: a961759a-afb1-4720-b922-d230a3155058]     self._destroy(instance)
2015-02-03 05:52:16.408 30113 TRACE nova.compute.manager [instance: a961759a-afb1-4720-b922-d230a3155058]   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 1011, in _destroy
2015-02-03 05:52:16.408 30113 TRACE nova.compute.manager [instance: a961759a-afb1-4720-b922-d230a3155058]     instance=instance)
2015-02-03 05:52:16.408 30113 TRACE nova.compute.manager [instance: a961759a-afb1-4720-b922-d230a3155058]   File "/usr/lib/python2.7/site-packages/nova/openstack/common/excutils.py", line 82, in __exit__
2015-02-03 05:52:16.408 30113 TRACE nova.compute.manager [instance: a961759a-afb1-4720-b922-d230a3155058]     six.reraise(self.type_, self.value, self.tb)
2015-02-03 05:52:16.408 30113 TRACE nova.compute.manager [instance: a961759a-afb1-4720-b922-d230a3155058]   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 980, in _destroy
2015-02-03 05:52:16.408 30113 TRACE nova.compute.manager [instance: a961759a-afb1-4720-b922-d230a3155058]     virt_dom.destroy()
2015-02-03 05:52:16.408 30113 TRACE nova.compute.manager [instance: a961759a-afb1-4720-b922-d230a3155058]   File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 183, in doit
2015-02-03 05:52:16.408 30113 TRACE nova.compute.manager [instance: a961759a-afb1-4720-b922-d230a3155058]     result = proxy_call(self._autowrap, f, *args, **kwargs)
2015-02-03 05:52:16.408 30113 TRACE nova.compute.manager [instance: a961759a-afb1-4720-b922-d230a3155058]   File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 141, in proxy_call
2015-02-03 05:52:16.408 30113 TRACE nova.compute.manager [instance: a961759a-afb1-4720-b922-d230a3155058]     rv = execute(f, *args, **kwargs)
2015-02-03 05:52:16.408 30113 TRACE nova.compute.manager [instance: a961759a-afb1-4720-b922-d230a3155058]   File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 122, in execute
2015-02-03 05:52:16.408 30113 TRACE nova.compute.manager [instance: a961759a-afb1-4720-b922-d230a3155058]     six.reraise(c, e, tb)
2015-02-03 05:52:16.408 30113 TRACE nova.compute.manager [instance: a961759a-afb1-4720-b922-d230a3155058]   File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 80, in tworker
2015-02-03 05:52:16.408 30113 TRACE nova.compute.manager [instance: a961759a-afb1-4720-b922-d230a3155058]     rv = meth(*args, **kwargs)
2015-02-03 05:52:16.408 30113 TRACE nova.compute.manager [instance: a961759a-afb1-4720-b922-d230a3155058]   File "/usr/lib64/python2.7/site-packages/libvirt.py", line 1015, in destroy
2015-02-03 05:52:16.408 30113 TRACE nova.compute.manager [instance: a961759a-afb1-4720-b922-d230a3155058]     if ret == -1: raise libvirtError ('virDomainDestroy() failed', dom=self)
2015-02-03 05:52:16.408 30113 TRACE nova.compute.manager [instance: a961759a-afb1-4720-b922-d230a3155058] libvirtError: Failed to terminate process 6329 with SIGKILL: Device or resource busy
2015-02-03 05:52:16.408 30113 TRACE nova.compute.manager [instance: a961759a-afb1-4720-b922-d230a3155058] 
2015-02-03 05:52:18.056 30113 DEBUG nova.openstack.common.lockutils [req-b8415789-5391-4ce7-816f-e5f4c6507214 ] Created new semaphore "compute_resources" internal_lock /usr/lib/python2.7/site-packages/nova/openstack/common/lockutils.py:206
2015-02-03 05:52:18.056 30113 DEBUG nova.openstack.common.lockutils [req-b8415789-5391-4ce7-816f-e5f4c6507214 ] Acquired semaphore "compute_resources" lock /usr/lib/python2.7/site-packages/nova/openstack/common/lockutils.py:229
2015-02-03 05:52:18.056 30113 DEBUG nova.openstack.common.lockutils [req-b8415789-5391-4ce7-816f-e5f4c6507214 ] Got semaphore / lock "update_usage" inner /usr/lib/python2.7/site-packages/nova/openstack/common/lockutils.py:271
2015-02-03 05:52:18.220 30113 INFO nova.scheduler.client.report [req-b8415789-5391-4ce7-816f-e5f4c6507214 None] Compute_service record updated for ('cinder-lvm-os-7.1-rhos-6-build-6.novalocal', 'cinder-lvm-os-7.1-rhos-6-build-6.novalocal')
2015-02-03 05:52:18.221 30113 DEBUG nova.openstack.common.lockutils [req-b8415789-5391-4ce7-816f-e5f4c6507214 ] Releasing semaphore "compute_resources" lock /usr/lib/python2.7/site-packages/nova/openstack/common/lockutils.py:238
2015-02-03 05:52:18.221 30113 DEBUG nova.openstack.common.lockutils [req-b8415789-5391-4ce7-816f-e5f4c6507214 ] Semaphore / lock released "update_usage" inner /usr/lib/python2.7/site-packages/nova/openstack/common/lockutils.py:275
2015-02-03 05:52:18.221 30113 DEBUG nova.openstack.common.lockutils [req-b8415789-5391-4ce7-816f-e5f4c6507214 ] Releasing semaphore "a961759a-afb1-4720-b922-d230a3155058" lock /usr/lib/python2.7/site-packages/nova/openstack/common/lockutils.py:238
2015-02-03 05:52:18.222 30113 DEBUG nova.openstack.common.lockutils [req-b8415789-5391-4ce7-816f-e5f4c6507214 ] Semaphore / lock released "do_terminate_instance" inner /usr/lib/python2.7/site-packages/nova/openstack/common/lockutils.py:275
2015-02-03 05:52:18.739 30113 DEBUG nova.openstack.common.lockutils [req-b8415789-5391-4ce7-816f-e5f4c6507214 ] Created new semaphore "compute_resources" internal_lock /usr/lib/python2

Version-Release number of selected component (if applicable):


Attaching full logs from automation run

How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:

Comment 3 bkopilov 2015-02-03 11:33:15 UTC
Created attachment 987499 [details]
logs for nova  /var/logs/

Comment 4 Kashyap Chamarthy 2015-02-26 16:33:27 UTC
If it is possible to perform this test again in a reproducible manner, here are some log files that will be useful to narrow down this issue:

  - Sanitized Nova debug logs (preferably contextual on the source and
    destination hosts): compute.log, conductor.log, scheduler.log

    To enable Nova debug logs, add these config attributes:
        . . .  
        verbose = True
        debug = True
        . . .

    Restart Nova services:

        $ openstack-service restart nova

  - /var/log/libvirt/qemu/instance.log -- For both compute nodes involved. 
    This will give us the QEMU CLI that's being used by libvirt

    If the test can be repeated, it'll be useful to enable debug logging
    for libvirt on the Compute nodes involved, as libvirt is the
    component that's doing the heavy lifting under the hood. These
    logs give us detailed interactions between libvirt and QEMU on the
    source and destination Compute nodes. To enable:

     1. In /etc/libvirt/libvirtd.conf, have these two config attributes:

        . . .
        log_filters="1:libvirt 1:qemu 1:conf 1:security 3:event 3:json 3:file 3:object 1:util"
        log_outputs="1:file:/var/log/libvirt/libvirtd.log"
        . . .

     2. Restart libvirtd:

        $ systemctl restart libvirtd

     3. Repeat the test.

     4. Capture the libvirt logs (contextual) and attach them as plain
        text to the bug.


NOTE: Do *not* forget to turn off both the Nova and libvirt debugging
controls once the test is finished -- otherwise, they continue to log
heavily and can fill up the disk space quickly.

Comment 5 bkopilov 2015-03-01 07:10:02 UTC
Hi,
I enabled Nova and libvirt debug logging, and managed to reproduce the issue.
Tempest tests path: tempest.api.compute.servers.test_servers.*

Attaching the Nova logs and the libvirt log gz files to the bug.

Benny

Comment 6 bkopilov 2015-03-01 07:28:57 UTC
Created attachment 996711 [details]
libvirt.log files [gz file]

Comment 7 bkopilov 2015-03-01 07:29:39 UTC
Created attachment 996712 [details]
nova-compute.log  [gz file]

Comment 9 Kashyap Chamarthy 2015-03-26 10:59:02 UTC
First, bkopilov -- please provide a clear reproducer test case *without* 
involving Tempest.

Second, looking at the libvirtd debug logs, strictly speaking this is not
libvirt's problem. Here's what's happening:

libvirt sends SIGTERM (i.e. requesting the process to terminate gracefully),
then waits a few iterations (10 seconds) to see if it dies. If the process
still hasn't exited and @force is requested (it is requested, as can be seen
below), a SIGKILL (kill unconditionally; the receiving process will not get a
chance to clean up) is sent, and libvirt waits up to 5 more seconds for the
process to exit before returning.
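
To make that sequence concrete, here is a minimal, self-contained Python
sketch of the behaviour described above (an illustrative approximation of
what libvirt's virProcessKillPainfully() does, not its actual implementation;
the loop structure is an assumption, only the 10 + 5 second windows are taken
from this comment):

    import errno
    import os
    import signal
    import time

    def kill_painfully(pid, force=True):
        """Send SIGTERM, wait ~10s; if still alive and force is set,
        send SIGKILL and wait ~5s more before giving up."""

        def is_alive(pid):
            # Signal 0 only probes for existence; nothing is delivered.
            try:
                os.kill(pid, 0)
            except OSError as e:
                if e.errno == errno.ESRCH:   # no such process
                    return False
                raise
            return True

        os.kill(pid, signal.SIGTERM)        # polite request to exit

        deadline = time.time() + 10         # grace period after SIGTERM
        while time.time() < deadline:
            if not is_alive(pid):
                return True
            time.sleep(0.2)

        if not force:
            return False

        os.kill(pid, signal.SIGKILL)        # unconditional kill
        deadline = time.time() + 5          # short wait after SIGKILL
        while time.time() < deadline:
            if not is_alive(pid):
                return True
            time.sleep(0.2)

        # This is the failure seen in the logs: "Failed to terminate
        # process ... with SIGKILL: Device or resource busy" -- the
        # process is stuck (e.g. in uninterruptible I/O sleep) and
        # refuses to go away even after SIGKILL.
        return False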

And, reading a comment from Dan Berrange in a similar upstream bug[1]:

  "Usually when a QEMU process fails to go away upon SIGKILL it is 
  because it is stuck in an uninterruptable kernel sleep waiting on I/O
  from some non-responsive server.

  Given the CPU load of the gate tests though, it is conceivable that 
  the 15 second timeout is too short, particularly if the VM running
  tempest has a high steal time from the cloud host. ie 15 wallclock 
  seconds may have passed, but the VM might have only have a few seconds
  of scheduled run time."


[. . .]
2015-03-01 07:04:42.146+0000: 10526: debug : qemuMonitorJSONIOProcessLine:183 : Line [{"timestamp": {"seconds": 1425193482, "microseconds": 143626}, "event": "SHUTDOWN"}]
2015-03-01 07:04:42.146+0000: 10526: debug : qemuMonitorJSONIOProcessLine:198 : QEMU_MONITOR_RECV_EVENT: mon=0x7fa21800cff0 event={"timestamp": {"seconds": 1425193482, "microseconds": 143626
}, "event": "SHUTDOWN"}
2015-03-01 07:04:42.146+0000: 10526: debug : qemuMonitorJSONIOProcessEvent:142 : mon=0x7fa21800cff0 obj=0x7fa250706390
2015-03-01 07:04:42.146+0000: 10526: debug : qemuMonitorEmitEvent:1256 : mon=0x7fa21800cff0 event=SHUTDOWN
2015-03-01 07:04:42.146+0000: 10526: debug : qemuProcessHandleEvent:674 : vm=0x7fa220020020
2015-03-01 07:04:52.165+0000: 10535: debug : virProcessKillPainfully:315 : Timed out waiting after SIGTERM to process 26832, sending SIGKILL
2015-03-01 07:04:57.174+0000: 10535: error : virProcessKillPainfully:345 : Failed to terminate process 26832 with SIGKILL: Device or resource busy
2015-03-01 07:04:57.174+0000: 10535: debug : virDomainFree:2287 : dom=0x7fa22800e5d0, (VM: name=instance-00000041, uuid=3e8c7332-715c-4827-b443-a539a8ae7843)
2015-03-01 07:04:57.174+0000: 10526: debug : virObjectEventNew:647 : obj=0x7fa250721770
2015-03-01 07:04:57.174+0000: 10526: debug : qemuMonitorJSONIOProcessEvent:169 : handle SHUTDOWN handler=0x7fa2383401f0 data=(nil)
2015-03-01 07:04:57.174+0000: 10526: debug : qemuMonitorEmitShutdown:1267 : mon=0x7fa21800cff0
2015-03-01 07:04:57.174+0000: 10526: debug : qemuProcessHandleShutdown:699 : vm=0x7fa220020020
2015-03-01 07:04:57.174+0000: 10526: debug : qemuProcessHandleShutdown:716 : Transitioned guest instance-00000041 to shutdown state
2015-03-01 07:04:57.174+0000: 10526: debug : virObjectEventNew:647 : obj=0x7fa2507209d0
2015-03-01 07:04:57.176+0000: 10526: debug : qemuProcessKill:4969 : vm=0x7fa220020020 name=instance-00000041 pid=26832 flags=2
2015-03-01 07:04:57.176+0000: 10526: debug : qemuMonitorJSONIOProcess:252 : Total used 85 bytes out of 85 available in buffer
2015-03-01 07:04:57.176+0000: 10526: debug : virObjectEventDispose:121 : obj=0x7fa250721770
2015-03-01 07:04:57.176+0000: 10526: debug : virDomainFree:2287 : dom=0x7fa2507216e0, (VM: name=instance-00000041, uuid=3e8c7332-715c-4827-b443-a539a8ae7843)
2015-03-01 07:04:57.176+0000: 10526: debug : virObjectEventDispose:121 : obj=0x7fa2507209d0
2015-03-01 07:04:57.178+0000: 10529: debug : virConnectGetLibVersion:1589 : conn=0x7fa21c000aa0, libVir=0x7fa23f3bcb90
2015-03-01 07:04:57.185+0000: 10534: debug : virDomainLookupByName:2138 : conn=0x7fa21c000aa0, name=instance-00000042
2015-03-01 07:04:57.185+0000: 10534: debug : virDomainFree:2287 : dom=0x7fa208012f40, (VM: name=instance-00000042, uuid=dcf9a1c7-3bfd-4fad-829d-93031a34b047)
2015-03-01 07:04:57.187+0000: 10531: debug : virDomainDestroy:2190 : dom=0x7fa210016d00, (VM: name=instance-00000042, uuid=dcf9a1c7-3bfd-4fad-829d-93031a34b047)
2015-03-01 07:04:57.187+0000: 10531: debug : qemuProcessKill:4969 : vm=0x7fa218012d80 name=instance-00000042 pid=26280 flags=1
2015-03-01 07:04:57.187+0000: 10531: debug : virProcessKillPainfully:299 : vpid=26280 force=1
2015-03-01 07:04:57.189+0000: 10526: debug : qemuMonitorIOProcess:399 : QEMU_MONITOR_IO_PROCESS: mon=0x7fa21800aca0 buf={"timestamp": {"seconds": 1425193497, "microseconds": 188370}, "event": "SHUTDOWN"}
[. . .]


[1] https://bugs.launchpad.net/nova/+bug/1353939 -- 
Rescue fails with 'Failed to terminate process: Device or resource busy' in the n-cpu log

Comment 11 Kashyap Chamarthy 2015-07-09 20:44:17 UTC
As noted in comment #9 of this bug, and per the rationale described by Dan Berrange here -- https://bugzilla.redhat.com/show_bug.cgi?id=1205647#c13 -- the reason is that the "host is so overloaded that the kernel was not able to clean up the process in the time that libvirt was prepared to wait."


There's a fix merged upstream, which I also back-ported to the upstream stable/kilo branch[1], that should help alleviate this problem once it makes it into RHOS Nova as part of the next rebase:


commit dc6af6bf861b510834122aa75750fd784578e197
Author: Matt Riedemann <mriedem.com>
Date:   Sun May 10 18:46:37 2015 -0700

    libvirt: handle code=38 + sigkill (ebusy) in destroy()
    
    Handle the libvirt error during destroy when the sigkill fails due to an
    EBUSY. This is taken from a comment by danpb in the bug report as a
    potential workaround.
    
    Co-authored-by: Daniel Berrange (berrange)
    
    Closes-Bug: #1353939
    
    Conflicts:
        nova/tests/unit/virt/libvirt/test_driver.py
    
        NOTE (kashyapc): 'stable/kilo' branch doesn't have the
        'libvirt_guest' object, so, adjust the below unit tests accordingly:
    
            test_private_destroy_ebusy_timeout
            test_private_destroy_ebusy_multiple_attempt_ok
    
    Change-Id: I128bf6b939fbbc85df521fd3fe23c3c6f93b1b2c
    (cherry picked from commit 3907867601d1044eaadebff68a590d176abff6cf)

[1] https://review.openstack.org/#/c/193497
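
For reference, here is a simplified Python sketch of the kind of handling the
fix adds around virDomainDestroy(): retry the destroy when it fails with
error code 38 (VIR_ERR_SYSTEM_ERROR) because the QEMU process could not be
SIGKILLed (EBUSY). This is not the exact upstream patch; the retry count and
delay below are illustrative assumptions:

    import time

    import libvirt  # python-libvirt bindings

    MAX_DESTROY_ATTEMPTS = 3    # assumed for illustration
    RETRY_DELAY_SECONDS = 1     # assumed for illustration

    def destroy_with_retry(virt_dom):
        """Call virt_dom.destroy(), retrying on the SIGKILL/EBUSY failure."""
        for attempt in range(1, MAX_DESTROY_ATTEMPTS + 1):
            try:
                virt_dom.destroy()
                return
            except libvirt.libvirtError as e:
                code = e.get_error_code()
                busy = (code == libvirt.VIR_ERR_SYSTEM_ERROR and
                        'Failed to terminate process' in str(e))
                if not busy or attempt == MAX_DESTROY_ATTEMPTS:
                    # Not the EBUSY-on-SIGKILL case, or out of retries:
                    # propagate the error as before (instance goes to ERROR).
                    raise
                # Give the overloaded host a moment to reap the QEMU
                # process, then try the destroy again.
                time.sleep(RETRY_DELAY_SECONDS)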

Comment 20 errata-xmlrpc 2016-03-24 13:53:14 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-0507.html

