+++ This bug was initially created as a clone of Bug #1025025 +++

Description of problem:
When the qpidd service is restarted, openstack-cinder-volume requires a restart to reconnect to the qpidd service.

Version-Release number of selected component (if applicable):
openstack-cinder-2013.1.3-2.el6ost.noarch

How reproducible:
Every time

Steps to Reproduce:
1. Start qpidd and openstack-cinder-volume and verify cinder functionality (create/delete volumes, etc.)
2. Restart the qpidd service
3. Try to create/delete cinder volumes again (a scripted version of this check is sketched below)

Actual results:
Volumes fail to create/delete - they are shown in an error state in the dashboard.

Expected results:
Volumes create/delete successfully.

Additional info:

--- Additional comment from Dan Yocum on 2013-10-30 14:47:10 EDT ---

--- Additional comment from Dan Yocum on 2013-10-30 14:47:56 EDT ---

--- Additional comment from Eric Harney on 2013-11-13 15:30:17 EST ---

Here is my attempt to reproduce this -- I couldn't. No qpid errors were generated in /var/log/messages and behavior is as expected.

--- Additional comment from Eric Harney on 2013-11-13 15:31:26 EST ---

Dan: Any more info you can give about how to reproduce this? It doesn't seem to happen on my machine. (3.0.z on RHEL 6.5 Beta)

--- Additional comment from Dan Yocum on 2013-11-20 12:57:11 EST ---

Sadly, no, I don't have any more details right now. I'll try to reproduce the problem when I find some spare time (buwahahahaha! I crack me up), but I don't see that happening any time soon. :/ I'm ok with closing the ticket, and if/when I run into it again, I'll re-open it with more details.

--- Additional comment from Eric Harney on 2013-11-27 09:37:01 EST ---

Closing since I can't find any way to make this happen. Note that the errors in the volume log appear to be related to driver issues and not AMQP.

--- Additional comment from Dan Yocum on 2014-01-08 08:48:03 EST ---

Let's leave this open for the next couple of weeks. If/when the problem reproduces itself, we won't touch the volume service and you can log in and see what it's failing on. Sound like a plan?

--- Additional comment from Gordon Sim on 2014-01-08 11:22:21 EST ---

The qpidd errors sound like they might be https://bugzilla.redhat.com/show_bug.cgi?id=1041575 (see https://bugs.launchpad.net/cinder/+bug/1187298).

--- Additional comment from Eric Harney on 2014-01-14 18:04:54 EST ---

Bug 1050215 could be related/helpful here.

--- Additional comment from Ayal Baron on 2014-02-05 09:29:11 EST ---

Since this is a test-only bug, the bug it's tracking is already closed, and we have not been able to reproduce it, I'm moving this to ON_QA.

--- Additional comment from Dafna Ron on 2014-02-19 05:05:09 EST ---

This specific issue did not reproduce, but we hit a different one, which I will open a new bug for. Closing this as verified with the following packages:

[root@orange-vdsf ~]# rpm -qa | grep cinder
openstack-cinder-2013.2.2-1.el6ost.noarch
python-cinderclient-1.0.7-2.el6ost.noarch
python-cinder-2013.2.2-1.el6ost.noarch

[root@puma31 ~]# rpm -qa | grep qpid
qpid-cpp-client-0.18-14.el6.x86_64
qpid-cpp-server-0.18-14.el6.x86_64
python-qpid-0.18-4.el6.noarch

--- Additional comment from errata-xmlrpc on 2014-03-04 15:12:56 EST ---

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2014-0213.html
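For reference, the create/delete check in steps 1 and 3 of the cloned report above can be scripted against the Cinder API. This is a minimal sketch only, assuming the python-cinderclient v1 bindings mentioned in the verification comment; the credentials and keystone auth_url are placeholders, not values from this bug.

# Hypothetical sanity check: create a volume, wait for it to settle, delete it.
# Run once before restarting qpidd (step 1) and once after (step 3).
import time
from cinderclient.v1 import client

cinder = client.Client('admin', 'password', 'admin',
                       'http://keystone.example.com:5000/v2.0')

vol = cinder.volumes.create(size=1, display_name='qpid-restart-check')
for _ in range(60):
    vol = cinder.volumes.get(vol.id)
    if vol.status in ('available', 'error'):
        break
    time.sleep(5)

print('volume %s ended up in state: %s' % (vol.id, vol.status))
# If cinder-volume did not reconnect after the qpidd restart, the volume
# typically sticks in 'creating' or flips to 'error', as seen in the dashboard.
if vol.status == 'available':
    cinder.volumes.delete(vol)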
This seems somehow related to bug #1060689. Do we have the logs from the server? Can this be reproduced?
Jeff Dexter is working on getting the logs from the system.
/var/log/cinder/volume.log

2014-03-11 14:49:06    ERROR [root] Unexpected exception occurred 1 time(s)... retrying.
Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/cinder/openstack/common/excutils.py", line 62, in inner_func
    return infunc(*args, **kwargs)
  File "/usr/lib/python2.6/site-packages/cinder/openstack/common/rpc/impl_qpid.py", line 529, in _consumer_thread
    self.consume()
  File "/usr/lib/python2.6/site-packages/cinder/openstack/common/rpc/impl_qpid.py", line 520, in consume
    it.next()
  File "/usr/lib/python2.6/site-packages/cinder/openstack/common/rpc/impl_qpid.py", line 437, in iterconsume
    yield self.ensure(_error_callback, _consume)
  File "/usr/lib/python2.6/site-packages/cinder/openstack/common/rpc/impl_qpid.py", line 377, in ensure
    return method(*args, **kwargs)
  File "/usr/lib/python2.6/site-packages/cinder/openstack/common/rpc/impl_qpid.py", line 428, in _consume
    nxt_receiver = self.session.next_receiver(timeout=timeout)
  File "<string>", line 6, in next_receiver
  File "/usr/lib/python2.6/site-packages/qpid/messaging/endpoints.py", line 660, in next_receiver
    if self._ecwait(lambda: self.incoming, timeout):
  File "/usr/lib/python2.6/site-packages/qpid/messaging/endpoints.py", line 50, in _ecwait
    result = self._ewait(lambda: self.closed or predicate(), timeout)
  File "/usr/lib/python2.6/site-packages/qpid/messaging/endpoints.py", line 567, in _ewait
    self.check_error()
  File "/usr/lib/python2.6/site-packages/qpid/messaging/endpoints.py", line 556, in check_error
    raise self.error
SessionError: Queue cinder-volume has been deleted. (qpid/broker/Queue.cpp:1866)(408)

2014-03-11 14:49:16     INFO [cinder.service] Caught SIGTERM, stopping children
2014-03-11 14:49:16     INFO [cinder.service] Waiting on 1 children to exit
2014-03-11 14:49:16     INFO [cinder.service] Caught SIGTERM, exiting
2014-03-11 14:49:16     INFO [cinder.service] Child 24392 exited with status 1
2014-03-11 14:49:16     INFO [cinder.service] Starting 1 workers
2014-03-11 14:49:16     INFO [cinder.service] Started child 11841
2014-03-11 14:49:16    AUDIT [cinder.service] Starting cinder-volume node (version 2013.1.4)
2014-03-11 14:50:39     INFO [cinder.volume.manager] volume volume-b9a7f284-dfd5-42e7-a157-e4beb7427e0f: skipping export
2014-03-11 14:50:39     INFO [cinder.volume.manager] Updating volume status
2014-03-11 14:50:39     INFO [cinder.openstack.common.rpc.impl_qpid] Connected to AMQP server on public-msg1.os1.phx2.redhat.com:5672
2014-03-11 14:50:39     INFO [cinder.openstack.common.rpc.impl_qpid] Connected to AMQP server on public-msg1.os1.phx2.redhat.com:5672

There are no other red flags around this time: the system was not under high load (confirmed via sar data), and there are no messages in the log file.
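The traceback above comes from the oslo qpid driver's consumer thread: after the broker restart, the broker-side queue is gone and next_receiver() surfaces the SessionError. As a rough illustration of the failure/recovery pattern involved (this is NOT the actual impl_qpid.py code), the sketch below shows a bare qpid.messaging consumer that rebuilds its connection and session when the broker invalidates them; the broker hostname is taken from the log, while the address options are assumptions.

# Illustrative sketch only: tear down and rebuild the AMQP connection when the
# broker invalidates the session (the SessionError shown above).
import time
from qpid.messaging import Connection
from qpid.messaging.exceptions import Empty, MessagingError

BROKER = 'public-msg1.os1.phx2.redhat.com:5672'   # broker from the log above
ADDRESS = 'cinder-volume; {create: always, node: {type: queue}}'  # assumed options

conn = None
while True:
    try:
        conn = Connection.establish(BROKER)
        session = conn.session()
        receiver = session.receiver(ADDRESS)
        while True:
            try:
                msg = receiver.fetch(timeout=10)
            except Empty:
                continue  # nothing queued yet; keep polling
            print('received: %r' % msg.content)
            session.acknowledge(msg)
    except MessagingError as err:
        # Queue deleted / connection dropped after a qpidd restart:
        # close whatever is left and reconnect from scratch.
        print('AMQP error (%s); reconnecting in 5 seconds' % err)
        if conn is not None:
            try:
                conn.close()
            except MessagingError:
                pass
        time.sleep(5)

In this pattern any MessagingError (including the queue-deleted SessionError) is treated as fatal for the session, so the whole connection is rebuilt rather than resumed; that is roughly what the driver's ensure()/reconnect path is expected to do, and what the affected python-qpid builds reportedly fail to recover from.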
(In reply to Chris Roberts from comment #5)
> /var/log/cinder/volume.log
> 2014-03-11 14:49:06    ERROR [root] Unexpected exception occurred 1 time(s)... retrying.
> Traceback (most recent call last):
>   File "/usr/lib/python2.6/site-packages/cinder/openstack/common/excutils.py", line 62, in inner_func
>     return infunc(*args, **kwargs)
>   File "/usr/lib/python2.6/site-packages/cinder/openstack/common/rpc/impl_qpid.py", line 529, in _consumer_thread
>     self.consume()
>   File "/usr/lib/python2.6/site-packages/cinder/openstack/common/rpc/impl_qpid.py", line 520, in consume
>     it.next()
>   File "/usr/lib/python2.6/site-packages/cinder/openstack/common/rpc/impl_qpid.py", line 437, in iterconsume
>     yield self.ensure(_error_callback, _consume)
>   File "/usr/lib/python2.6/site-packages/cinder/openstack/common/rpc/impl_qpid.py", line 377, in ensure
>     return method(*args, **kwargs)
>   File "/usr/lib/python2.6/site-packages/cinder/openstack/common/rpc/impl_qpid.py", line 428, in _consume
>     nxt_receiver = self.session.next_receiver(timeout=timeout)
>   File "<string>", line 6, in next_receiver
>   File "/usr/lib/python2.6/site-packages/qpid/messaging/endpoints.py", line 660, in next_receiver
>     if self._ecwait(lambda: self.incoming, timeout):
>   File "/usr/lib/python2.6/site-packages/qpid/messaging/endpoints.py", line 50, in _ecwait
>     result = self._ewait(lambda: self.closed or predicate(), timeout)
>   File "/usr/lib/python2.6/site-packages/qpid/messaging/endpoints.py", line 567, in _ewait
>     self.check_error()
>   File "/usr/lib/python2.6/site-packages/qpid/messaging/endpoints.py", line 556, in check_error
>     raise self.error
> SessionError: Queue cinder-volume has been deleted. (qpid/broker/Queue.cpp:1866)(408)

I believe this is https://bugzilla.redhat.com/show_bug.cgi?id=1041575 and should be fixed by upgrading to python-qpid-0.18-8.
I can verify that python-qpid-0.18-9 is in NONE of the MRG channels in RHN Classic - I've just checked. Nor is it in the ost-3 or ost-4 channels.

[root@int-comp040 ~]# rhn-channel -l
rhel-x86_64-server-6
rhel-x86_64-server-6-mrg-grid-2
rhel-x86_64-server-6-mrg-grid-execute-2
rhel-x86_64-server-6-mrg-management-2
rhel-x86_64-server-6-mrg-messaging-2
rhel-x86_64-server-6-mrg-realtime-2
rhel-x86_64-server-6-ost-3
rhel-x86_64-server-6-rhscl-1
rhel-x86_64-server-optional-6
rhel-x86_64-server-supplementary-6

[root@int-comp040 ~]# yum list python-qpid\*
Loaded plugins: downloadonly, priorities, rhnplugin, security
This system is receiving updates from RHN Classic or RHN Satellite.
867 packages excluded due to repository priority protections
Installed Packages
python-qpid.noarch               0.14-11.el6_3     @rhel-x86_64-server-6
Available Packages
python-qpid-proton.x86_64        0.6-2.el6         epel
python-qpid-proton-doc.noarch    0.6-2.el6         epel
python-qpid-qmf.x86_64           0.14-14.el6_3     rhel-x86_64-server-6
Dan, could python-qpid-0.18-9.el6.noarch be one of the 867 packages excluded due to repository priority protections?
So, yes, I got the yum priorities sorted out and upgraded qpidd and python-qpid, and I'm fairly certain that _this_ issue went away. There are still cinder-qpid issues (openstack-cinder-volume requires a restart periodically), but that issue is not this issue (I don't think).