Bug 858530

Summary: instance enters error state and is undeleteable
Product: [Fedora] Fedora
Reporter: Steven Dake <sdake>
Component: openstack-nova
Assignee: Mark McLoughlin <markmc>
Status: CLOSED WONTFIX
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 17
CC: akscram, alexander.sakhnov, apevec, asalkeld, bfilippov, breu, Jan.van.Eldik, jonathansteffan, jose.castro.leon, markmc, matt_domsch, mlvov, p, rbryant, rkukura, shardy
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2013-07-31 23:34:00 EDT
Type: Bug
Attachments:
Description Flags
compute log none

Description Steven Dake 2012-09-18 23:08:03 EDT
Description of problem:
The nova delete command fails to delete a VM.  This problem happens occasionally, but I'm not entirely sure what triggers it.  Per a conversation with Padraig, I am being a bit more aggressive about filing bug reports, even when the information may not be sufficient to resolve the problem.

Version-Release number of selected component (if applicable):
openstack-nova-compute-2012.1.1-15.fc17.noarch


How reproducible:
Low, but it kills the infrastructure when it occurs.

Steps to Reproduce:
1. Not certain.
2. Using heat for several days at a time seems to trigger the problem in openstack.
  
Actual results:
nova delete does not delete a vm

Expected results:
nova delete deletes a vm

Additional info:
[sdake@bigiron noarch]$ nova list
+--------------------------------------+-------------------------------------------+--------+------------------+
|                  ID                  |                    Name                   | Status |     Networks     |
+--------------------------------------+-------------------------------------------+--------+------------------+
| c10bbd93-0429-4d69-89e2-42cfa131f1ce | teststack.ElasticLoadBalancer.LB_instance | ERROR  | demonet=10.0.0.2 |
+--------------------------------------+-------------------------------------------+--------+------------------+


I have attached the compute.log which has some backtraces.

2012-09-18 19:56:59 TRACE nova.rpc.impl_qpid Traceback (most recent call last):
2012-09-18 19:56:59 TRACE nova.rpc.impl_qpid   File "/usr/lib/python2.7/site-packages/nova/rpc/impl_qpid.py", line 364, in ensure
2012-09-18 19:56:59 TRACE nova.rpc.impl_qpid     return method(*args, **kwargs)
2012-09-18 19:56:59 TRACE nova.rpc.impl_qpid   File "/usr/lib/python2.7/site-packages/nova/rpc/impl_qpid.py", line 413, in _consume
2012-09-18 19:56:59 TRACE nova.rpc.impl_qpid     nxt_receiver = self.session.next_receiver(timeout=timeout)
2012-09-18 19:56:59 TRACE nova.rpc.impl_qpid   File "<string>", line 6, in next_receiver
2012-09-18 19:56:59 TRACE nova.rpc.impl_qpid   File "/usr/lib/python2.7/site-packages/qpid/messaging/endpoints.py", line 663, in next_receiver
2012-09-18 19:56:59 TRACE nova.rpc.impl_qpid     raise Empty
2012-09-18 19:56:59 TRACE nova.rpc.impl_qpid Empty: None
2012-09-18 19:56:59 TRACE nova.rpc.impl_qpid

And a different one:

2012-09-18 19:56:59 TRACE nova.rpc.amqp Traceback (most recent call last):
2012-09-18 19:56:59 TRACE nova.rpc.amqp   File "/usr/lib/python2.7/site-packages/nova/rpc/amqp.py", line 253, in _process_data
2012-09-18 19:56:59 TRACE nova.rpc.amqp     rval = node_func(context=ctxt, **node_args)
2012-09-18 19:56:59 TRACE nova.rpc.amqp   File "/usr/lib/python2.7/site-packages/nova/exception.py", line 114, in wrapped
2012-09-18 19:56:59 TRACE nova.rpc.amqp     return f(*args, **kw)
2012-09-18 19:56:59 TRACE nova.rpc.amqp   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 159, in decorated_function
2012-09-18 19:56:59 TRACE nova.rpc.amqp     function(self, context, instance_uuid, *args, **kwargs)
2012-09-18 19:56:59 TRACE nova.rpc.amqp   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 183, in decorated_function
2012-09-18 19:56:59 TRACE nova.rpc.amqp     sys.exc_info())
2012-09-18 19:56:59 TRACE nova.rpc.amqp   File "/usr/lib64/python2.7/contextlib.py", line 24, in __exit__
2012-09-18 19:56:59 TRACE nova.rpc.amqp     self.gen.next()
2012-09-18 19:56:59 TRACE nova.rpc.amqp   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 177, in decorated_function
2012-09-18 19:56:59 TRACE nova.rpc.amqp     return function(self, context, instance_uuid, *args, **kwargs)
2012-09-18 19:56:59 TRACE nova.rpc.amqp   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 771, in terminate_instance
2012-09-18 19:56:59 TRACE nova.rpc.amqp     do_terminate_instance()
2012-09-18 19:56:59 TRACE nova.rpc.amqp   File "/usr/lib/python2.7/site-packages/nova/utils.py", line 946, in inner
2012-09-18 19:56:59 TRACE nova.rpc.amqp     retval = f(*args, **kwargs)
2012-09-18 19:56:59 TRACE nova.rpc.amqp   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 764, in do_terminate_instance
2012-09-18 19:56:59 TRACE nova.rpc.amqp     self._delete_instance(context, instance)
2012-09-18 19:56:59 TRACE nova.rpc.amqp   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 742, in _delete_instance
2012-09-18 19:56:59 TRACE nova.rpc.amqp     self._shutdown_instance(context, instance, 'Terminating')
2012-09-18 19:56:59 TRACE nova.rpc.amqp   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 704, in _shutdown_instance
2012-09-18 19:56:59 TRACE nova.rpc.amqp     self._deallocate_network(context, instance)
2012-09-18 19:56:59 TRACE nova.rpc.amqp   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 636, in _deallocate_network
2012-09-18 19:56:59 TRACE nova.rpc.amqp     self.network_api.deallocate_for_instance(context, instance)
2012-09-18 19:56:59 TRACE nova.rpc.amqp   File "/usr/lib/python2.7/site-packages/nova/network/api.py", line 190, in deallocate_for_instance
2012-09-18 19:56:59 TRACE nova.rpc.amqp     'args': args})
2012-09-18 19:56:59 TRACE nova.rpc.amqp   File "/usr/lib/python2.7/site-packages/nova/rpc/__init__.py", line 68, in call
2012-09-18 19:56:59 TRACE nova.rpc.amqp     return _get_impl().call(context, topic, msg, timeout)
2012-09-18 19:56:59 TRACE nova.rpc.amqp   File "/usr/lib/python2.7/site-packages/nova/rpc/impl_qpid.py", line 526, in call
2012-09-18 19:56:59 TRACE nova.rpc.amqp     return rpc_amqp.call(context, topic, msg, timeout, Connection.pool)
2012-09-18 19:56:59 TRACE nova.rpc.amqp   File "/usr/lib/python2.7/site-packages/nova/rpc/amqp.py", line 343, in call
2012-09-18 19:56:59 TRACE nova.rpc.amqp     rv = list(rv)
2012-09-18 19:56:59 TRACE nova.rpc.amqp   File "/usr/lib/python2.7/site-packages/nova/rpc/amqp.py", line 304, in __iter__
2012-09-18 19:56:59 TRACE nova.rpc.amqp     self.done()
2012-09-18 19:56:59 TRACE nova.rpc.amqp   File "/usr/lib64/python2.7/contextlib.py", line 24, in __exit__
2012-09-18 19:56:59 TRACE nova.rpc.amqp     self.gen.next()
2012-09-18 19:56:59 TRACE nova.rpc.amqp   File "/usr/lib/python2.7/site-packages/nova/rpc/amqp.py", line 301, in __iter__
2012-09-18 19:56:59 TRACE nova.rpc.amqp     self._iterator.next()
2012-09-18 19:56:59 TRACE nova.rpc.amqp   File "/usr/lib/python2.7/site-packages/nova/rpc/impl_qpid.py", line 422, in iterconsume
2012-09-18 19:56:59 TRACE nova.rpc.amqp     yield self.ensure(_error_callback, _consume)
2012-09-18 19:56:59 TRACE nova.rpc.amqp   File "/usr/lib/python2.7/site-packages/nova/rpc/impl_qpid.py", line 368, in ensure
2012-09-18 19:56:59 TRACE nova.rpc.amqp     error_callback(e)
2012-09-18 19:56:59 TRACE nova.rpc.amqp   File "/usr/lib/python2.7/site-packages/nova/rpc/impl_qpid.py", line 407, in _error_callback
2012-09-18 19:56:59 TRACE nova.rpc.amqp     raise rpc_common.Timeout()
2012-09-18 19:56:59 TRACE nova.rpc.amqp Timeout: Timeout while waiting on RPC response.
2012-09-18 19:56:59 TRACE nova.rpc.amqp
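
The two tracebacks above appear to be two views of one event: a blocking wait for an RPC reply gets nothing before its timeout (the `Empty` in the first trace), and an error callback turns that into the `Timeout` seen in the second. The following is only a minimal sketch of that pattern, not nova's code. It uses the standard library `queue` module in place of qpid, and `RpcTimeout`, `ensure` and `wait_for_reply` are hypothetical names chosen for illustration.

```python
import queue


class RpcTimeout(Exception):
    """Hypothetical stand-in for the Timeout raised in the second trace."""


def ensure(error_callback, method, *args, **kwargs):
    # Run the receive; if nothing arrived in time, let the callback decide
    # what to raise (here it always raises RpcTimeout).
    try:
        return method(*args, **kwargs)
    except queue.Empty as exc:
        error_callback(exc)


def wait_for_reply(replies, timeout):
    """Block for one reply on `replies`, or raise RpcTimeout."""

    def _error_callback(exc):
        raise RpcTimeout("Timeout while waiting on RPC response.")

    def _consume():
        # queue.Queue.get raises queue.Empty when the timeout expires.
        return replies.get(timeout=timeout)

    return ensure(_error_callback, _consume)
```

In this sketch the caller only ever sees `RpcTimeout`. If the service that should reply never does, the caller's operation (here, the instance delete) fails partway through, which would be consistent with the instance being left in the ERROR state.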
Comment 1 Steven Dake 2012-09-18 23:09:32 EDT
Created attachment 614189 [details]
compute log
Comment 2 Steven Dake 2012-09-18 23:12:14 EDT
Restarting openstack unwedges the nova delete, and nova delete then works properly.

I restarted using heat/tools/openstack stop followed by heat/tools/openstack start.
Comment 3 Pádraig Brady 2012-09-19 04:50:40 EDT
Hmm, the qpid timeouts may match https://bugs.launchpad.net/openstack-common/+bug/1050661

Could you try this and then restart nova:

openstack-config --set /etc/nova/nova.conf DEFAULT qpid_heartbeat 60
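
For hosts where openstack-config is not available, the same edit could be sketched with Python's standard `configparser`. This is an assumption-laden illustration, not what openstack-config does internally: `set_option` is a hypothetical helper, and `configparser` drops comments and lowercases option names when it rewrites the file, so back up nova.conf first.

```python
import configparser


def set_option(path, section, key, value):
    """Set key = value in `section` of the INI file at `path`."""
    # interpolation=None so '%' characters in existing values are left alone.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read(path)  # a missing file is treated as empty
    if section != "DEFAULT" and not parser.has_section(section):
        parser.add_section(section)
    parser.set(section, key, str(value))
    with open(path, "w") as fh:
        parser.write(fh)


# Usage (path and value taken from the command above; run as root):
# set_option("/etc/nova/nova.conf", "DEFAULT", "qpid_heartbeat", 60)
```

As with the command above, nova needs a restart afterwards for the setting to take effect.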
Comment 4 Steven Dake 2012-09-19 15:49:30 EDT
Padraig,

I will run with that setting from now on and report back on this bug if the same backtrace occurs again.
Comment 5 Fedora End Of Life 2013-07-03 20:08:59 EDT
This message is a reminder that Fedora 17 is nearing its end of life.
Approximately 4 (four) weeks from now Fedora will stop maintaining
and issuing updates for Fedora 17. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '17'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 17's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that
we may not be able to fix it before Fedora 17 is end of life. If you
would still like to see this bug fixed and are able to reproduce it
against a later version of Fedora, you are encouraged to change the
'version' to a later Fedora version prior to Fedora 17's end of life.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.
Comment 6 Fedora End Of Life 2013-07-31 23:34:06 EDT
Fedora 17 changed to end-of-life (EOL) status on 2013-07-30. Fedora 17 is 
no longer maintained, which means that it will not receive any further 
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of 
Fedora please feel free to reopen this bug against that version.

Thank you for reporting this bug and we are sorry it could not be fixed.