Description of problem:
OpenStack installed on RHEL 7.1 with RHOS 6. While running the Tempest suite, an error occurs when trying to terminate an instance. From the nova-compute logs:

2015-02-03 05:51:59.270 30113 DEBUG nova.openstack.common.processutils [req-ae1ecc18-ee36-4a3c-9fe9-ca5d3e296a09 None] Result was 1 execute /usr/lib/python2.7/site-packages/nova/openstack/common/processutils.py:195
2015-02-03 05:51:59.271 30113 DEBUG nova.virt.libvirt.driver [req-ae1ecc18-ee36-4a3c-9fe9-ca5d3e296a09 None] [instance: a961759a-afb1-4720-b922-d230a3155058] Could not determine fibre channel world wide node names get_volume_connector /usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py:1280
2015-02-03 05:51:59.272 30113 DEBUG nova.openstack.common.processutils [req-ae1ecc18-ee36-4a3c-9fe9-ca5d3e296a09 None] Running cmd (subprocess): sudo nova-rootwrap /etc/nova/rootwrap.conf systool -c fc_host -v execute /usr/lib/python2.7/site-packages/nova/openstack/common/processutils.py:161
2015-02-03 05:51:59.496 30113 DEBUG nova.openstack.common.processutils [req-ae1ecc18-ee36-4a3c-9fe9-ca5d3e296a09 None] Result was 1 execute /usr/lib/python2.7/site-packages/nova/openstack/common/processutils.py:195
2015-02-03 05:51:59.498 30113 DEBUG nova.virt.libvirt.driver [req-ae1ecc18-ee36-4a3c-9fe9-ca5d3e296a09 None] [instance: a961759a-afb1-4720-b922-d230a3155058] Could not determine fibre channel world wide port names get_volume_connector /usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py:1287
2015-02-03 05:51:59.498 30113 DEBUG nova.volume.cinder [req-ae1ecc18-ee36-4a3c-9fe9-ca5d3e296a09 None] Cinderclient connection created using URL: http://10.35.186.75:8776/v1/f4e7b04367eb4cbfb72b9c668ef95e0d get_cinder_client_version /usr/lib/python2.7/site-packages/nova/volume/cinder.py:255
2015-02-03 05:51:59.694 30113 DEBUG nova.volume.cinder [req-ae1ecc18-ee36-4a3c-9fe9-ca5d3e296a09 None] Cinderclient connection created using URL: http://10.35.186.75:8776/v1/f4e7b04367eb4cbfb72b9c668ef95e0d get_cinder_client_version /usr/lib/python2.7/site-packages/nova/volume/cinder.py:255
2015-02-03 05:52:01.051 30113 DEBUG nova.openstack.common.lockutils [req-b8415789-5391-4ce7-816f-e5f4c6507214 ] Created new semaphore "a961759a-afb1-4720-b922-d230a3155058" internal_lock /usr/lib/python2.7/site-packages/nova/openstack/common/lockutils.py:206
2015-02-03 05:52:01.051 30113 DEBUG nova.openstack.common.lockutils [req-b8415789-5391-4ce7-816f-e5f4c6507214 ] Acquired semaphore "a961759a-afb1-4720-b922-d230a3155058" lock /usr/lib/python2.7/site-packages/nova/openstack/common/lockutils.py:229
2015-02-03 05:52:01.052 30113 DEBUG nova.openstack.common.lockutils [req-b8415789-5391-4ce7-816f-e5f4c6507214 ] Got semaphore / lock "do_terminate_instance" inner /usr/lib/python2.7/site-packages/nova/openstack/common/lockutils.py:271
2015-02-03 05:52:01.052 30113 DEBUG nova.openstack.common.lockutils [req-b8415789-5391-4ce7-816f-e5f4c6507214 ] Created new semaphore "a961759a-afb1-4720-b922-d230a3155058-events" internal_lock /usr/lib/python2.7/site-packages/nova/openstack/common/lockutils.py:206
2015-02-03 05:52:01.052 30113 DEBUG nova.openstack.common.lockutils [req-b8415789-5391-4ce7-816f-e5f4c6507214 ] Acquired semaphore "a961759a-afb1-4720-b922-d230a3155058-events" lock /usr/lib/python2.7/site-packages/nova/openstack/common/lockutils.py:229
2015-02-03 05:52:01.053 30113 DEBUG nova.openstack.common.lockutils [req-b8415789-5391-4ce7-816f-e5f4c6507214 ] Got semaphore / lock "_clear_events" inner /usr/lib/python2.7/site-packages/nova/openstack/common/lockutils.py:271
2015-02-03 05:52:01.053 30113 DEBUG nova.openstack.common.lockutils [req-b8415789-5391-4ce7-816f-e5f4c6507214 ] Releasing semaphore "a961759a-afb1-4720-b922-d230a3155058-events" lock /usr/lib/python2.7/site-packages/nova/openstack/common/lockutils.py:238
2015-02-03 05:52:01.053 30113 DEBUG nova.openstack.common.lockutils [req-b8415789-5391-4ce7-816f-e5f4c6507214 ] Semaphore / lock released "_clear_events" inner /usr/lib/python2.7/site-packages/nova/openstack/common/lockutils.py:275
2015-02-03 05:52:01.083 30113 AUDIT nova.compute.manager [req-b8415789-5391-4ce7-816f-e5f4c6507214 None] [instance: a961759a-afb1-4720-b922-d230a3155058] Terminating instance
2015-02-03 05:52:16.214 30113 ERROR nova.virt.libvirt.driver [req-b8415789-5391-4ce7-816f-e5f4c6507214 None] [instance: a961759a-afb1-4720-b922-d230a3155058] Error from libvirt during destroy. Code=38 Error=Failed to terminate process 6329 with SIGKILL: Device or resource busy
2015-02-03 05:52:16.216 30113 DEBUG nova.compute.manager [req-b8415789-5391-4ce7-816f-e5f4c6507214 None] [instance: a961759a-afb1-4720-b922-d230a3155058] Deallocating network for instance _deallocate_network /usr/lib/python2.7/site-packages/nova/compute/manager.py:1916
2015-02-03 05:52:16.216 30113 DEBUG nova.network.neutronv2.api [req-b8415789-5391-4ce7-816f-e5f4c6507214 None] [instance: a961759a-afb1-4720-b922-d230a3155058] deallocate_for_instance() deallocate_for_instance /usr/lib/python2.7/site-packages/nova/network/neutronv2/api.py:528
2015-02-03 05:52:16.217 30113 DEBUG neutronclient.client [req-b8415789-5391-4ce7-816f-e5f4c6507214 ] REQ: curl -i http://10.35.186.75:9696/v2.0/ports.json?device_id=a961759a-afb1-4720-b922-d230a3155058 -X GET -H "X-Auth-Token: c56320c0c2524825bced0c35f1259f4d" -H "User-Agent: python-neutronclient" http_log_req /usr/lib/python2.7/site-packages/neutronclient/common/utils.py:140
2015-02-03 05:52:16.263 30113 DEBUG neutronclient.client [req-b8415789-5391-4ce7-816f-e5f4c6507214 ] RESP:200 {'date': 'Tue, 03 Feb 2015 10:52:16 GMT', 'content-length': '13', 'content-type': 'application/json; charset=UTF-8', 'x-openstack-request-id': 'req-649fc8b3-1bb0-4979-8718-7ad83b3c72c4'} {"ports": []} http_log_resp /usr/lib/python2.7/site-packages/neutronclient/common/utils.py:149
2015-02-03 05:52:16.264 30113 DEBUG nova.network.base_api [req-b8415789-5391-4ce7-816f-e5f4c6507214 None] Updating cache with info: [] update_instance_cache_with_nw_info /usr/lib/python2.7/site-packages/nova/network/base_api.py:40
2015-02-03 05:52:16.408 30113 ERROR nova.compute.manager [req-b8415789-5391-4ce7-816f-e5f4c6507214 None] [instance: a961759a-afb1-4720-b922-d230a3155058] Setting instance vm_state to ERROR
2015-02-03 05:52:16.408 30113 TRACE nova.compute.manager [instance: a961759a-afb1-4720-b922-d230a3155058] Traceback (most recent call last):
2015-02-03 05:52:16.408 30113 TRACE nova.compute.manager [instance: a961759a-afb1-4720-b922-d230a3155058] File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2510, in do_terminate_instance
2015-02-03 05:52:16.408 30113 TRACE nova.compute.manager [instance: a961759a-afb1-4720-b922-d230a3155058] self._delete_instance(context, instance, bdms, quotas)
2015-02-03 05:52:16.408 30113 TRACE nova.compute.manager [instance: a961759a-afb1-4720-b922-d230a3155058] File "/usr/lib/python2.7/site-packages/nova/hooks.py", line 131, in inner
2015-02-03 05:52:16.408 30113 TRACE nova.compute.manager [instance: a961759a-afb1-4720-b922-d230a3155058] rv = f(*args, **kwargs)
2015-02-03 05:52:16.408 30113 TRACE nova.compute.manager [instance: a961759a-afb1-4720-b922-d230a3155058] File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2479, in _delete_instance
2015-02-03 05:52:16.408 30113 TRACE nova.compute.manager [instance: a961759a-afb1-4720-b922-d230a3155058] quotas.rollback()
2015-02-03 05:52:16.408 30113 TRACE nova.compute.manager [instance: a961759a-afb1-4720-b922-d230a3155058] File "/usr/lib/python2.7/site-packages/nova/openstack/common/excutils.py", line 82, in __exit__
2015-02-03 05:52:16.408 30113 TRACE nova.compute.manager [instance: a961759a-afb1-4720-b922-d230a3155058] six.reraise(self.type_, self.value, self.tb)
2015-02-03 05:52:16.408 30113 TRACE nova.compute.manager [instance: a961759a-afb1-4720-b922-d230a3155058] File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2456, in _delete_instance
2015-02-03 05:52:16.408 30113 TRACE nova.compute.manager [instance: a961759a-afb1-4720-b922-d230a3155058] self._shutdown_instance(context, instance, bdms)
2015-02-03 05:52:16.408 30113 TRACE nova.compute.manager [instance: a961759a-afb1-4720-b922-d230a3155058] File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2386, in _shutdown_instance
2015-02-03 05:52:16.408 30113 TRACE nova.compute.manager [instance: a961759a-afb1-4720-b922-d230a3155058] requested_networks)
2015-02-03 05:52:16.408 30113 TRACE nova.compute.manager [instance: a961759a-afb1-4720-b922-d230a3155058] File "/usr/lib/python2.7/site-packages/nova/openstack/common/excutils.py", line 82, in __exit__
2015-02-03 05:52:16.408 30113 TRACE nova.compute.manager [instance: a961759a-afb1-4720-b922-d230a3155058] six.reraise(self.type_, self.value, self.tb)
2015-02-03 05:52:16.408 30113 TRACE nova.compute.manager [instance: a961759a-afb1-4720-b922-d230a3155058] File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2375, in _shutdown_instance
2015-02-03 05:52:16.408 30113 TRACE nova.compute.manager [instance: a961759a-afb1-4720-b922-d230a3155058] block_device_info)
2015-02-03 05:52:16.408 30113 TRACE nova.compute.manager [instance: a961759a-afb1-4720-b922-d230a3155058] File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 1055, in destroy
2015-02-03 05:52:16.408 30113 TRACE nova.compute.manager [instance: a961759a-afb1-4720-b922-d230a3155058] self._destroy(instance)
2015-02-03 05:52:16.408 30113 TRACE nova.compute.manager [instance: a961759a-afb1-4720-b922-d230a3155058] File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 1011, in _destroy
2015-02-03 05:52:16.408 30113 TRACE nova.compute.manager [instance: a961759a-afb1-4720-b922-d230a3155058] instance=instance)
2015-02-03 05:52:16.408 30113 TRACE nova.compute.manager [instance: a961759a-afb1-4720-b922-d230a3155058] File "/usr/lib/python2.7/site-packages/nova/openstack/common/excutils.py", line 82, in __exit__
2015-02-03 05:52:16.408 30113 TRACE nova.compute.manager [instance: a961759a-afb1-4720-b922-d230a3155058] six.reraise(self.type_, self.value, self.tb)
2015-02-03 05:52:16.408 30113 TRACE nova.compute.manager [instance: a961759a-afb1-4720-b922-d230a3155058] File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 980, in _destroy
2015-02-03 05:52:16.408 30113 TRACE nova.compute.manager [instance: a961759a-afb1-4720-b922-d230a3155058] virt_dom.destroy()
2015-02-03 05:52:16.408 30113 TRACE nova.compute.manager [instance: a961759a-afb1-4720-b922-d230a3155058] File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 183, in doit
2015-02-03 05:52:16.408 30113 TRACE nova.compute.manager [instance: a961759a-afb1-4720-b922-d230a3155058] result = proxy_call(self._autowrap, f, *args, **kwargs)
2015-02-03 05:52:16.408 30113 TRACE nova.compute.manager [instance: a961759a-afb1-4720-b922-d230a3155058] File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 141, in proxy_call
2015-02-03 05:52:16.408 30113 TRACE nova.compute.manager [instance: a961759a-afb1-4720-b922-d230a3155058] rv = execute(f, *args, **kwargs)
2015-02-03 05:52:16.408 30113 TRACE nova.compute.manager [instance: a961759a-afb1-4720-b922-d230a3155058] File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 122, in execute
2015-02-03 05:52:16.408 30113 TRACE nova.compute.manager [instance: a961759a-afb1-4720-b922-d230a3155058] six.reraise(c, e, tb)
2015-02-03 05:52:16.408 30113 TRACE nova.compute.manager [instance: a961759a-afb1-4720-b922-d230a3155058] File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 80, in tworker
2015-02-03 05:52:16.408 30113 TRACE nova.compute.manager [instance: a961759a-afb1-4720-b922-d230a3155058] rv = meth(*args, **kwargs)
2015-02-03 05:52:16.408 30113 TRACE nova.compute.manager [instance: a961759a-afb1-4720-b922-d230a3155058] File "/usr/lib64/python2.7/site-packages/libvirt.py", line 1015, in destroy
2015-02-03 05:52:16.408 30113 TRACE nova.compute.manager [instance: a961759a-afb1-4720-b922-d230a3155058] if ret == -1: raise libvirtError ('virDomainDestroy() failed', dom=self)
2015-02-03 05:52:16.408 30113 TRACE nova.compute.manager [instance: a961759a-afb1-4720-b922-d230a3155058] libvirtError: Failed to terminate process 6329 with SIGKILL: Device or resource busy
2015-02-03 05:52:16.408 30113 TRACE nova.compute.manager [instance: a961759a-afb1-4720-b922-d230a3155058]
2015-02-03 05:52:18.056 30113 DEBUG nova.openstack.common.lockutils [req-b8415789-5391-4ce7-816f-e5f4c6507214 ] Created new semaphore "compute_resources" internal_lock /usr/lib/python2.7/site-packages/nova/openstack/common/lockutils.py:206
2015-02-03 05:52:18.056 30113 DEBUG nova.openstack.common.lockutils [req-b8415789-5391-4ce7-816f-e5f4c6507214 ] Acquired semaphore "compute_resources" lock /usr/lib/python2.7/site-packages/nova/openstack/common/lockutils.py:229
2015-02-03 05:52:18.056 30113 DEBUG nova.openstack.common.lockutils [req-b8415789-5391-4ce7-816f-e5f4c6507214 ] Got semaphore / lock "update_usage" inner /usr/lib/python2.7/site-packages/nova/openstack/common/lockutils.py:271
2015-02-03 05:52:18.220 30113 INFO nova.scheduler.client.report [req-b8415789-5391-4ce7-816f-e5f4c6507214 None] Compute_service record updated for ('cinder-lvm-os-7.1-rhos-6-build-6.novalocal', 'cinder-lvm-os-7.1-rhos-6-build-6.novalocal')
2015-02-03 05:52:18.221 30113 DEBUG nova.openstack.common.lockutils [req-b8415789-5391-4ce7-816f-e5f4c6507214 ] Releasing semaphore "compute_resources" lock /usr/lib/python2.7/site-packages/nova/openstack/common/lockutils.py:238
2015-02-03 05:52:18.221 30113 DEBUG nova.openstack.common.lockutils [req-b8415789-5391-4ce7-816f-e5f4c6507214 ] Semaphore / lock released "update_usage" inner /usr/lib/python2.7/site-packages/nova/openstack/common/lockutils.py:275
2015-02-03 05:52:18.221 30113 DEBUG nova.openstack.common.lockutils [req-b8415789-5391-4ce7-816f-e5f4c6507214 ] Releasing semaphore "a961759a-afb1-4720-b922-d230a3155058" lock /usr/lib/python2.7/site-packages/nova/openstack/common/lockutils.py:238
2015-02-03 05:52:18.222 30113 DEBUG nova.openstack.common.lockutils [req-b8415789-5391-4ce7-816f-e5f4c6507214 ] Semaphore / lock released "do_terminate_instance" inner /usr/lib/python2.7/site-packages/nova/openstack/common/lockutils.py:275
2015-02-03 05:52:18.739 30113 DEBUG nova.openstack.common.lockutils [req-b8415789-5391-4ce7-816f-e5f4c6507214 ] Created new semaphore "compute_resources" internal_lock /usr/lib/python2

Version-Release number of selected component (if applicable):
Attaching full logs from automation run

How reproducible:

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:

Additional info:
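For context, the traceback above bottoms out in virt_dom.destroy(), i.e. libvirt's virDomainDestroy(), which fails with Code=38 (VIR_ERR_SYSTEM_ERROR). Below is a minimal sketch, using the python libvirt bindings, of how that same error surfaces; the connection URI and domain name are placeholders, not taken from this environment:

    # Sketch only: shows how the "Code=38 ... Device or resource busy" error
    # reported above surfaces from the libvirt python bindings.
    import libvirt

    conn = libvirt.open('qemu:///system')
    dom = conn.lookupByName('instance-00000041')  # placeholder domain name
    try:
        dom.destroy()  # graceful SIGTERM, then SIGKILL if the guest does not exit
    except libvirt.libvirtError as exc:
        # VIR_ERR_SYSTEM_ERROR == 38, matching "Code=38" in the nova-compute log
        if exc.get_error_code() == libvirt.VIR_ERR_SYSTEM_ERROR:
            print('destroy failed, QEMU process could not be killed: %s'
                  % exc.get_error_message())
        else:
            raise
    finally:
        conn.close()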
Created attachment 987499 [details] logs for nova /var/logs/
If it is possible to perform this test again in a reproducible manner, the following log files will be useful to narrow down this issue:

- Sanitized Nova debug logs (preferably contextual, from the source and destination hosts): compute.log, conductor.log, scheduler.log.

  To enable Nova debug logs, add these config attributes (in /etc/nova/nova.conf):

    . . .
    verbose = True
    debug = True
    . . .

  Then restart the Nova services:

    $ openstack-service restart nova

- /var/log/libvirt/qemu/instance.log -- for both Compute nodes involved. This will give us the QEMU command line that is being used by libvirt.

If the test can be repeated, it will also be useful to enable debug logging for libvirt on the Compute nodes involved, as libvirt is the component doing the heavy lifting under the hood. These logs give us the detailed interactions between libvirt and QEMU on the source and destination Compute nodes. To enable:

1. In /etc/libvirt/libvirtd.conf, set these two config attributes:

    . . .
    log_filters="1:libvirt 1:qemu 1:conf 1:security 3:event 3:json 3:file 3:object 1:util"
    log_outputs="1:file:/var/log/libvirt/libvirtd.log"
    . . .

2. Restart libvirtd:

    $ systemctl restart libvirtd

3. Repeat the test.

4. Capture the libvirt logs (contextual) and attach them as plain text to the bug.

NOTE: Do *not* forget to turn off both the Nova and libvirt debugging controls once the test is finished -- otherwise they continue to log heavily and can fill up the disk space quickly.
Hi,

I enabled Nova and libvirt debug logging and managed to reproduce the issue.

Tempest tests path: tempest.api.compute.servers.test_servers.*

Attaching the Nova logs and the libvirt log gz files to the bug.

Benny
Created attachment 996711 [details] libvirt.log files [gz file]
Created attachment 996712 [details] nova-compute.log [gz file]
First, bkopilov -- please provide a clear reproducer test case *without* involving Tempest.

Second, looking at the libvirtd debug logs (strictly speaking, it's not libvirt's problem), here's what's happening: libvirt sends SIGTERM (i.e. a request to terminate gracefully), then waits a few iterations (10 seconds) to see if the process dies. If the process still hasn't exited and @force is requested (it is requested, as can be seen below), a SIGKILL (kill unconditionally; the receiving process does not get a chance to clean up) is sent, and libvirt waits up to 5 more seconds for the process to exit before returning.

And, quoting a comment from Dan Berrange in a similar upstream bug[1]:

"Usually when a QEMU process fails to go away upon SIGKILL it is because it is stuck in an uninterruptable kernel sleep waiting on I/O from some non-responsive server. Given the CPU load of the gate tests though, it is conceivable that the 15 second timeout is too short, particularly if the VM running tempest has a high steal time from the cloud host. ie 15 wallclock seconds may have passed, but the VM might have only have a few seconds of scheduled run time."

[. . .]
2015-03-01 07:04:42.146+0000: 10526: debug : qemuMonitorJSONIOProcessLine:183 : Line [{"timestamp": {"seconds": 1425193482, "microseconds": 143626}, "event": "SHUTDOWN"}]
2015-03-01 07:04:42.146+0000: 10526: debug : qemuMonitorJSONIOProcessLine:198 : QEMU_MONITOR_RECV_EVENT: mon=0x7fa21800cff0 event={"timestamp": {"seconds": 1425193482, "microseconds": 143626 }, "event": "SHUTDOWN"}
2015-03-01 07:04:42.146+0000: 10526: debug : qemuMonitorJSONIOProcessEvent:142 : mon=0x7fa21800cff0 obj=0x7fa250706390
2015-03-01 07:04:42.146+0000: 10526: debug : qemuMonitorEmitEvent:1256 : mon=0x7fa21800cff0 event=SHUTDOWN
2015-03-01 07:04:42.146+0000: 10526: debug : qemuProcessHandleEvent:674 : vm=0x7fa220020020
2015-03-01 07:04:52.165+0000: 10535: debug : virProcessKillPainfully:315 : Timed out waiting after SIGTERM to process 26832, sending SIGKILL
2015-03-01 07:04:57.174+0000: 10535: error : virProcessKillPainfully:345 : Failed to terminate process 26832 with SIGKILL: Device or resource busy
2015-03-01 07:04:57.174+0000: 10535: debug : virDomainFree:2287 : dom=0x7fa22800e5d0, (VM: name=instance-00000041, uuid=3e8c7332-715c-4827-b443-a539a8ae7843)
2015-03-01 07:04:57.174+0000: 10526: debug : virObjectEventNew:647 : obj=0x7fa250721770
2015-03-01 07:04:57.174+0000: 10526: debug : qemuMonitorJSONIOProcessEvent:169 : handle SHUTDOWN handler=0x7fa2383401f0 data=(nil)
2015-03-01 07:04:57.174+0000: 10526: debug : qemuMonitorEmitShutdown:1267 : mon=0x7fa21800cff0
2015-03-01 07:04:57.174+0000: 10526: debug : qemuProcessHandleShutdown:699 : vm=0x7fa220020020
2015-03-01 07:04:57.174+0000: 10526: debug : qemuProcessHandleShutdown:716 : Transitioned guest instance-00000041 to shutdown state
2015-03-01 07:04:57.174+0000: 10526: debug : virObjectEventNew:647 : obj=0x7fa2507209d0
2015-03-01 07:04:57.176+0000: 10526: debug : qemuProcessKill:4969 : vm=0x7fa220020020 name=instance-00000041 pid=26832 flags=2
2015-03-01 07:04:57.176+0000: 10526: debug : qemuMonitorJSONIOProcess:252 : Total used 85 bytes out of 85 available in buffer
2015-03-01 07:04:57.176+0000: 10526: debug : virObjectEventDispose:121 : obj=0x7fa250721770
2015-03-01 07:04:57.176+0000: 10526: debug : virDomainFree:2287 : dom=0x7fa2507216e0, (VM: name=instance-00000041, uuid=3e8c7332-715c-4827-b443-a539a8ae7843)
2015-03-01 07:04:57.176+0000: 10526: debug : virObjectEventDispose:121 : obj=0x7fa2507209d0
2015-03-01 07:04:57.178+0000: 10529: debug : virConnectGetLibVersion:1589 : conn=0x7fa21c000aa0, libVir=0x7fa23f3bcb90
2015-03-01 07:04:57.185+0000: 10534: debug : virDomainLookupByName:2138 : conn=0x7fa21c000aa0, name=instance-00000042
2015-03-01 07:04:57.185+0000: 10534: debug : virDomainFree:2287 : dom=0x7fa208012f40, (VM: name=instance-00000042, uuid=dcf9a1c7-3bfd-4fad-829d-93031a34b047)
2015-03-01 07:04:57.187+0000: 10531: debug : virDomainDestroy:2190 : dom=0x7fa210016d00, (VM: name=instance-00000042, uuid=dcf9a1c7-3bfd-4fad-829d-93031a34b047)
2015-03-01 07:04:57.187+0000: 10531: debug : qemuProcessKill:4969 : vm=0x7fa218012d80 name=instance-00000042 pid=26280 flags=1
2015-03-01 07:04:57.187+0000: 10531: debug : virProcessKillPainfully:299 : vpid=26280 force=1
2015-03-01 07:04:57.189+0000: 10526: debug : qemuMonitorIOProcess:399 : QEMU_MONITOR_IO_PROCESS: mon=0x7fa21800aca0 buf={"timestamp": {"seconds": 1425193497, "microseconds": 188370}, "event": "SHUTDOWN"}
[. . .]

[1] https://bugs.launchpad.net/nova/+bug/1353939 -- Rescue fails with 'Failed to terminate process: Device or resource busy' in the n-cpu log
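To make the timing described above concrete, here is a rough Python illustration of the kill sequence. This is only a sketch of the behaviour of virProcessKillPainfully() (which is C code inside libvirt); the ~10s and ~5s waits mirror the timeouts mentioned above:

    # Illustration of the SIGTERM-then-SIGKILL sequence described above;
    # not libvirt source code.
    import errno
    import os
    import signal
    import time

    def process_alive(pid):
        """Return True if a process with this pid still exists."""
        try:
            os.kill(pid, 0)  # signal 0: existence check only
            return True
        except OSError as err:
            return err.errno != errno.ESRCH

    def kill_painfully(pid, force=True):
        """SIGTERM, wait ~10s; if force, SIGKILL and wait ~5s more."""
        os.kill(pid, signal.SIGTERM)
        for _ in range(10):               # ~10 seconds of graceful waiting
            if not process_alive(pid):
                return True
            time.sleep(1)
        if not force:
            return False
        os.kill(pid, signal.SIGKILL)      # the process gets no chance to clean up
        for _ in range(5):                # ~5 more seconds
            if not process_alive(pid):
                return True
            time.sleep(1)
        # Mirrors "Failed to terminate process ... with SIGKILL: Device or
        # resource busy" -- typically the process is stuck in uninterruptible I/O.
        return False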
As noted in comment #9 of this bug, and per the rationale described by Dan Berrange here -- https://bugzilla.redhat.com/show_bug.cgi?id=1205647#c13 -- the reason is that the "host is so overloaded that the kernel was not able to clean up the process in the time that libvirt was prepared to wait."

There is a fix merged upstream, which I have also back-ported to the upstream stable/kilo branch[1], that should help alleviate this problem once it makes it into RHOS Nova as part of the next rebase:

commit dc6af6bf861b510834122aa75750fd784578e197
Author: Matt Riedemann <mriedem.com>
Date:   Sun May 10 18:46:37 2015 -0700

    libvirt: handle code=38 + sigkill (ebusy) in destroy()

    Handle the libvirt error during destroy when the sigkill fails due to
    an EBUSY. This is taken from a comment by danpb in the bug report as
    a potential workaround.

    Co-authored-by: Daniel Berrange (berrange)

    Closes-Bug: #1353939

    Conflicts:
        nova/tests/unit/virt/libvirt/test_driver.py

    NOTE (kashyapc): 'stable/kilo' branch doesn't have the
    'libvirt_guest' object, so adjust the below unit tests accordingly:

        test_private_destroy_ebusy_timeout
        test_private_destroy_ebusy_multiple_attempt_ok

    Change-Id: I128bf6b939fbbc85df521fd3fe23c3c6f93b1b2c
    (cherry picked from commit 3907867601d1044eaadebff68a590d176abff6cf)

[1] https://review.openstack.org/#/c/193497
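In effect, the patch retries the destroy when libvirt reports the SIGKILL/EBUSY failure instead of immediately flipping the instance to ERROR. A simplified sketch of that idea (not the literal Nova code; 'virt_dom' stands for the libvirt domain object Nova already holds, and the retry count and sleep interval here are illustrative):

    # Simplified sketch of the retry-on-EBUSY idea from the patch above;
    # not the literal Nova implementation.
    import time
    import libvirt

    def destroy_with_retry(virt_dom, attempts=3, wait=1):
        """Retry destroy() when SIGKILL fails with 'device or resource busy'."""
        for attempt in range(1, attempts + 1):
            try:
                virt_dom.destroy()
                return
            except libvirt.libvirtError as exc:
                busy = (exc.get_error_code() == libvirt.VIR_ERR_SYSTEM_ERROR and
                        'Failed to terminate process' in exc.get_error_message())
                if busy and attempt < attempts:
                    # Give the kernel a little more time to reap the QEMU process.
                    time.sleep(wait)
                    continue
                raise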
With the latest build the test is passing:
https://rhos-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/view/RHOS/view/RHOS7/job/rhos-jenkins-rhos-7.0-puddle-rhel-7.2-3networkers-packstack-neutron-ml2-vxlan-rabbitmq-tempest-git-all/34/testReport/

The code is in openstack-nova-scheduler-2015.1.3-7.el7ost.noarch.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHBA-2016-0507.html