Bug 1075684
Summary: | openstack-cinder-volume service doesn't reconnect to qpidd after qpidd restart | ||
---|---|---|---|
Product: | Red Hat OpenStack | Reporter: | Chris Roberts <chrobert> |
Component: | openstack-cinder | Assignee: | Flavio Percoco <fpercoco> |
Status: | CLOSED INSUFFICIENT_DATA | QA Contact: | Dafna Ron <dron> |
Severity: | medium | Docs Contact: | |
Priority: | medium | ||
Version: | 3.0 | CC: | breeler, chrobert, dyocum, eharney, erich, fpercoco, gsim, jdexter, mlopes, ramon, sclewis, scohen, sgotliv, yeylon |
Target Milestone: | --- | ||
Target Release: | 5.0 (RHEL 7) | ||
Hardware: | x86_64 | ||
OS: | Linux | ||
Whiteboard: | storage | ||
Fixed In Version: | | Doc Type: | Bug Fix |
Doc Text: | | Story Points: | --- |
Clone Of: | 1025025 | Environment: | |
Last Closed: | 2014-05-29 15:38:29 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | --- |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: |
Description
Chris Roberts
2014-03-12 15:14:50 UTC
This seems somehow related to #1060689.

Do we have the logs of the server? Can this be reproduced?

Jeff Dexter is working on getting the logs from the system.

/var/log/cinder/volume.log

2014-03-11 14:49:06 ERROR [root] Unexpected exception occurred 1 time(s)... retrying.
Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/cinder/openstack/common/excutils.py", line 62, in inner_func
    return infunc(*args, **kwargs)
  File "/usr/lib/python2.6/site-packages/cinder/openstack/common/rpc/impl_qpid.py", line 529, in _consumer_thread
    self.consume()
  File "/usr/lib/python2.6/site-packages/cinder/openstack/common/rpc/impl_qpid.py", line 520, in consume
    it.next()
  File "/usr/lib/python2.6/site-packages/cinder/openstack/common/rpc/impl_qpid.py", line 437, in iterconsume
    yield self.ensure(_error_callback, _consume)
  File "/usr/lib/python2.6/site-packages/cinder/openstack/common/rpc/impl_qpid.py", line 377, in ensure
    return method(*args, **kwargs)
  File "/usr/lib/python2.6/site-packages/cinder/openstack/common/rpc/impl_qpid.py", line 428, in _consume
    nxt_receiver = self.session.next_receiver(timeout=timeout)
  File "<string>", line 6, in next_receiver
  File "/usr/lib/python2.6/site-packages/qpid/messaging/endpoints.py", line 660, in next_receiver
    if self._ecwait(lambda: self.incoming, timeout):
  File "/usr/lib/python2.6/site-packages/qpid/messaging/endpoints.py", line 50, in _ecwait
    result = self._ewait(lambda: self.closed or predicate(), timeout)
  File "/usr/lib/python2.6/site-packages/qpid/messaging/endpoints.py", line 567, in _ewait
    self.check_error()
  File "/usr/lib/python2.6/site-packages/qpid/messaging/endpoints.py", line 556, in check_error
    raise self.error
SessionError: Queue cinder-volume has been deleted.
(qpid/broker/Queue.cpp:1866)(408)
2014-03-11 14:49:16 INFO [cinder.service] Caught SIGTERM, stopping children
2014-03-11 14:49:16 INFO [cinder.service] Waiting on 1 children to exit
2014-03-11 14:49:16 INFO [cinder.service] Caught SIGTERM, exiting
2014-03-11 14:49:16 INFO [cinder.service] Child 24392 exited with status 1
2014-03-11 14:49:16 INFO [cinder.service] Starting 1 workers
2014-03-11 14:49:16 INFO [cinder.service] Started child 11841
2014-03-11 14:49:16 AUDIT [cinder.service] Starting cinder-volume node (version 2013.1.4)
2014-03-11 14:50:39 INFO [cinder.volume.manager] volume volume-b9a7f284-dfd5-42e7-a157-e4beb7427e0f: skipping export
2014-03-11 14:50:39 INFO [cinder.volume.manager] Updating volume status
2014-03-11 14:50:39 INFO [cinder.openstack.common.rpc.impl_qpid] Connected to AMQP server on public-msg1.os1.phx2.redhat.com:5672
2014-03-11 14:50:39 INFO [cinder.openstack.common.rpc.impl_qpid] Connected to AMQP server on public-msg1.os1.phx2.redhat.com:5672

There are no other red flags around this time: the system was not under high load (confirmed via sar data), and there are no messages in the log file.

(In reply to Chris Roberts from comment #5)
> /var/log/cinder/volume.log
> 2014-03-11 14:49:06 ERROR [root] Unexpected exception occurred 1 time(s)... retrying.
> Traceback (most recent call last):
>   File "/usr/lib/python2.6/site-packages/cinder/openstack/common/excutils.py", line 62, in inner_func
>     return infunc(*args, **kwargs)
>   File "/usr/lib/python2.6/site-packages/cinder/openstack/common/rpc/impl_qpid.py", line 529, in _consumer_thread
>     self.consume()
>   File "/usr/lib/python2.6/site-packages/cinder/openstack/common/rpc/impl_qpid.py", line 520, in consume
>     it.next()
>   File "/usr/lib/python2.6/site-packages/cinder/openstack/common/rpc/impl_qpid.py", line 437, in iterconsume
>     yield self.ensure(_error_callback, _consume)
>   File "/usr/lib/python2.6/site-packages/cinder/openstack/common/rpc/impl_qpid.py", line 377, in ensure
>     return method(*args, **kwargs)
>   File "/usr/lib/python2.6/site-packages/cinder/openstack/common/rpc/impl_qpid.py", line 428, in _consume
>     nxt_receiver = self.session.next_receiver(timeout=timeout)
>   File "<string>", line 6, in next_receiver
>   File "/usr/lib/python2.6/site-packages/qpid/messaging/endpoints.py", line 660, in next_receiver
>     if self._ecwait(lambda: self.incoming, timeout):
>   File "/usr/lib/python2.6/site-packages/qpid/messaging/endpoints.py", line 50, in _ecwait
>     result = self._ewait(lambda: self.closed or predicate(), timeout)
>   File "/usr/lib/python2.6/site-packages/qpid/messaging/endpoints.py", line 567, in _ewait
>     self.check_error()
>   File "/usr/lib/python2.6/site-packages/qpid/messaging/endpoints.py", line 556, in check_error
>     raise self.error
> SessionError: Queue cinder-volume has been deleted. (qpid/broker/Queue.cpp:1866)(408)

I believe this is https://bugzilla.redhat.com/show_bug.cgi?id=1041575 and should be fixed by upgrading to python-qpid-0.18-8.

I can verify that python-qpid-0.18-9 is in NONE of the MRG channels in RHN classic - I've just checked. Nor is it in the ost-3 or ost-4 channels.
[root@int-comp040 ~]# rhn-channel -l
rhel-x86_64-server-6
rhel-x86_64-server-6-mrg-grid-2
rhel-x86_64-server-6-mrg-grid-execute-2
rhel-x86_64-server-6-mrg-management-2
rhel-x86_64-server-6-mrg-messaging-2
rhel-x86_64-server-6-mrg-realtime-2
rhel-x86_64-server-6-ost-3
rhel-x86_64-server-6-rhscl-1
rhel-x86_64-server-optional-6
rhel-x86_64-server-supplementary-6
[root@int-comp040 ~]# yum list python-qpid\*
Loaded plugins: downloadonly, priorities, rhnplugin, security
This system is receiving updates from RHN Classic or RHN Satellite.
867 packages excluded due to repository priority protections
Installed Packages
python-qpid.noarch              0.14-11.el6_3   @rhel-x86_64-server-6
Available Packages
python-qpid-proton.x86_64       0.6-2.el6       epel
python-qpid-proton-doc.noarch   0.6-2.el6       epel
python-qpid-qmf.x86_64          0.14-14.el6_3   rhel-x86_64-server-6

Dan, could python-qpid-0.18-9.el6.noarch be one of the 867 packages excluded due to repository priority protections?

So, yes, I got the yum priorities correct and upgraded qpidd and python-qpid, and I'm fairly certain that _this_ issue went away. There are still cinder-qpid issues (openstack-cinder-volume requires a restart periodically), but that issue is not this issue (I don't think).
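
For anyone checking other hosts for the same condition, here is a rough sketch (not part of the original report) that pulls the installed python-qpid version out of `yum list` output and compares it with the 0.18-8 build suggested above. The helper names (`installed_version`, `compare_evr`) are made up for illustration, and the comparison is a simplified segment-by-segment check that ignores epochs and other RPM corner cases; it is not the real rpmvercmp algorithm, so treat the result as a hint and confirm with rpm/yum.

```python
import re

# Sample taken from the `yum list` output quoted in this report.
YUM_OUTPUT = """\
Installed Packages
python-qpid.noarch 0.14-11.el6_3 @rhel-x86_64-server-6
Available Packages
python-qpid-proton.x86_64 0.6-2.el6 epel
python-qpid-proton-doc.noarch 0.6-2.el6 epel
python-qpid-qmf.x86_64 0.14-14.el6_3 rhel-x86_64-server-6
"""

# name.arch  version-release  repo  (installed packages have a repo starting with "@")
PACKAGE_LINE = re.compile(
    r"^(?P<name>\S+)\.(?P<arch>[^.\s]+)\s+(?P<evr>\S+)\s+(?P<repo>\S+)$"
)


def installed_version(yum_output, package):
    """Return the version-release of an installed package, or None if absent."""
    for line in yum_output.splitlines():
        m = PACKAGE_LINE.match(line.strip())
        if m and m.group("name") == package and m.group("repo").startswith("@"):
            return m.group("evr")
    return None


def split_segments(text):
    """Split a version or release string into numeric and alphabetic segments."""
    return [int(t) if t.isdigit() else t for t in re.findall(r"\d+|[A-Za-z]+", text)]


def compare_part(a, b):
    """Compare two version (or release) strings; returns -1, 0 or 1."""
    sa, sb = split_segments(a), split_segments(b)
    for x, y in zip(sa, sb):
        if x == y:
            continue
        # Simplification: a numeric segment is treated as newer than an alphabetic one.
        if isinstance(x, int) != isinstance(y, int):
            return 1 if isinstance(x, int) else -1
        return 1 if x > y else -1
    # All shared segments equal: the string with more segments is treated as newer.
    return (len(sa) > len(sb)) - (len(sa) < len(sb))


def compare_evr(a, b):
    """Compare 'version-release' strings (epoch is ignored in this sketch)."""
    va, _, ra = a.partition("-")
    vb, _, rb = b.partition("-")
    return compare_part(va, vb) or compare_part(ra, rb)


if __name__ == "__main__":
    installed = installed_version(YUM_OUTPUT, "python-qpid")
    needed = "0.18-8"  # build mentioned in this report as carrying the fix
    status = "ok" if compare_evr(installed, needed) >= 0 else "too old"
    print("python-qpid %s (need >= %s): %s" % (installed, needed, status))
```

If the installed build turns out to be older and the newer one does not show up as available, the "packages excluded due to repository priority protections" line in the yum output is worth a look, as discussed above.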