Bug 1108959 - nova-conductor hang with RPC errors
Keywords:
Status: CLOSED DUPLICATE of bug 1085006
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-nova
Version: 4.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: 5.0 (RHEL 7)
Assignee: RHOS Maint
QA Contact: Ami Jeain
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2014-06-13 01:58 UTC by Ian Wienand
Modified: 2019-09-09 16:35 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2014-06-13 08:23:19 UTC
Target Upstream Version:
Embargoed:


Description Ian Wienand 2014-06-13 01:58:21 UTC
My VM is stuck in the scheduling state:

---
[root@osdashboard dashboard(keystone_iwienand)]$ nova list
+--------------------------------------+-----------------------------------------------------------+--------+------------+-------------+----------+
| ID                                   | Name                                                      | Status | Task State | Power State | Networks |
+--------------------------------------+-----------------------------------------------------------+--------+------------+-------------+----------+
| 82666d0d-2566-4cc2-b057-0b06acc60f4b | gerritbot_99779_devstack_OSLAB_RHEL65_20140613012350_2764 | BUILD  | scheduling | NOSTATE     |          |
+--------------------------------------+-----------------------------------------------------------+--------+------------+-------------+----------+
---

There is no entry for this instance in /var/log/nova/scheduler.log; the most recent line there is for a different instance:

---
[root@host02 nova]# tail -n1 scheduler.log
2014-06-12 22:47:58.526 28017 INFO nova.scheduler.filter_scheduler [req-78a0575f-498a-4116-94d2-8daadb58db1d 54b9f83693c84bf2b72286e9609eee36 210a99a1e68f43218f4cab705c908d45] Choosing host WeighedHost [host: host12.oslab.priv, weight: 12689.0] for instance a265f82e-db86-477c-a0f8-e0d3a42c651b
[root@host02 nova]# date
Fri Jun 13 01:33:51 UTC 2014
[root@host02 nova]# 
---

Upon investigation, I can see that the conductor queue is not being processed: its depth keeps growing while its dequeue rate stays at 0.00.

---
[root@host02 log]# qpid-queue-stats conductor
Queue Name                                     Sec       Depth     Enq Rate     Deq Rate
========================================================================================
22dda29c-7712-4846-8e9c-cabed3066f52:0.2      60.00          0         0.02         0.02
conductor                                      9.94       3198         0.10         0.00
q-plugin                                       9.93          0         3.12         3.12
qmfc-v2-ui-host02.oslab.priv.29131.1           9.93          0         0.10         0.10
topic-host02.oslab.priv.29131.1                9.93          0         0.20         0.20
conductor                                     10.00       3199         0.10         0.00
q-plugin                                      10.00          0         2.90         2.90
qmfc-v2-ui-host02.oslab.priv.29131.1          10.00          0         0.10         0.10
topic-host02.oslab.priv.29131.1               10.00          0         0.20         0.20
---
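
The stall in the qpid-queue-stats output above (non-zero depth with a zero dequeue rate) could be picked out automatically. The following is a minimal sketch, assuming the column layout is exactly as shown above (queue name, then Sec, Depth, Enq Rate, Deq Rate); the names `parse_queue_stats` and `find_stalled_queues` are hypothetical helpers, not part of qpid-tools.

```python
from collections import namedtuple

# One sampled row of qpid-queue-stats output.
Sample = namedtuple("Sample", "name sec depth enq_rate deq_rate")

# A few rows copied from the output in this report.
SAMPLE = """\
Queue Name                                     Sec       Depth     Enq Rate     Deq Rate
========================================================================================
conductor                                      9.94       3198         0.10         0.00
q-plugin                                       9.93          0         3.12         3.12
conductor                                     10.00       3199         0.10         0.00
q-plugin                                      10.00          0         2.90         2.90
"""


def parse_queue_stats(text):
    """Return a list of Sample rows, skipping header and separator lines."""
    samples = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 5:
            continue
        try:
            sec = float(parts[-4])
            depth = int(parts[-3])
            enq_rate = float(parts[-2])
            deq_rate = float(parts[-1])
        except ValueError:
            continue  # header line: the trailing columns are not numeric
        name = " ".join(parts[:-4])
        samples.append(Sample(name, sec, depth, enq_rate, deq_rate))
    return samples


def find_stalled_queues(text):
    """Names of queues whose latest sample has messages queued but none dequeued."""
    latest = {}
    for sample in parse_queue_stats(text):
        latest[sample.name] = sample  # later rows overwrite earlier ones
    return [name for name, s in latest.items()
            if s.depth > 0 and s.deq_rate == 0.0]


if __name__ == "__main__":
    print(find_stalled_queues(SAMPLE))
```

Only the most recent sample per queue is considered, so a queue that was briefly backed up but has since drained is not reported.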

Checking the nova-conductor logs shows that it has stopped processing messages. The first error was:

---
2014-06-12 22:52:55.353 27944 ERROR root [-] Unexpected exception occurred 1 time(s)... retrying.
2014-06-12 22:52:55.353 27944 TRACE root Traceback (most recent call last):
2014-06-12 22:52:55.353 27944 TRACE root   File "/usr/lib/python2.6/site-packages/nova/openstack/common/excutils.py", line 78, in inner_func
2014-06-12 22:52:55.353 27944 TRACE root     return infunc(*args, **kwargs)
2014-06-12 22:52:55.353 27944 TRACE root   File "/usr/lib/python2.6/site-packages/nova/openstack/common/rpc/impl_qpid.py", line 698, in _consumer_thread
2014-06-12 22:52:55.353 27944 TRACE root     self.consume()
2014-06-12 22:52:55.353 27944 TRACE root   File "/usr/lib/python2.6/site-packages/nova/openstack/common/rpc/impl_qpid.py", line 689, in consume
2014-06-12 22:52:55.353 27944 TRACE root     it.next()
2014-06-12 22:52:55.353 27944 TRACE root   File "/usr/lib/python2.6/site-packages/nova/openstack/common/rpc/impl_qpid.py", line 606, in iterconsume
2014-06-12 22:52:55.353 27944 TRACE root     yield self.ensure(_error_callback, _consume)
2014-06-12 22:52:55.353 27944 TRACE root   File "/usr/lib/python2.6/site-packages/nova/openstack/common/rpc/impl_qpid.py", line 540, in ensure
2014-06-12 22:52:55.353 27944 TRACE root     return method(*args, **kwargs)
2014-06-12 22:52:55.353 27944 TRACE root   File "/usr/lib/python2.6/site-packages/nova/openstack/common/rpc/impl_qpid.py", line 597, in _consume
2014-06-12 22:52:55.353 27944 TRACE root     nxt_receiver = self.session.next_receiver(timeout=timeout)
2014-06-12 22:52:55.353 27944 TRACE root   File "<string>", line 6, in next_receiver
2014-06-12 22:52:55.353 27944 TRACE root   File "/usr/lib/python2.6/site-packages/qpid/messaging/endpoints.py", line 660, in next_receiver
2014-06-12 22:52:55.353 27944 TRACE root     if self._ecwait(lambda: self.incoming, timeout):
2014-06-12 22:52:55.353 27944 TRACE root   File "/usr/lib/python2.6/site-packages/qpid/messaging/endpoints.py", line 50, in _ecwait
2014-06-12 22:52:55.353 27944 TRACE root     result = self._ewait(lambda: self.closed or predicate(), timeout)
2014-06-12 22:52:55.353 27944 TRACE root   File "/usr/lib/python2.6/site-packages/qpid/messaging/endpoints.py", line 566, in _ewait
2014-06-12 22:52:55.353 27944 TRACE root     result = self.connection._ewait(lambda: self.error or predicate(), timeout)
2014-06-12 22:52:55.353 27944 TRACE root   File "/usr/lib/python2.6/site-packages/qpid/messaging/endpoints.py", line 209, in _ewait
2014-06-12 22:52:55.353 27944 TRACE root     self.check_error()
2014-06-12 22:52:55.353 27944 TRACE root   File "/usr/lib/python2.6/site-packages/qpid/messaging/endpoints.py", line 202, in check_error
2014-06-12 22:52:55.353 27944 TRACE root     raise self.error
2014-06-12 22:52:55.353 27944 TRACE root InternalError: Traceback (most recent call last):
2014-06-12 22:52:55.353 27944 TRACE root   File "/usr/lib/python2.6/site-packages/qpid/messaging/driver.py", line 651, in write
2014-06-12 22:52:55.353 27944 TRACE root     self._op_dec.write(*self._seg_dec.read())
2014-06-12 22:52:55.353 27944 TRACE root   File "/usr/lib/python2.6/site-packages/qpid/framing.py", line 273, in write
2014-06-12 22:52:55.353 27944 TRACE root     if self.op.payload is None:
2014-06-12 22:52:55.353 27944 TRACE root AttributeError: 'NoneType' object has no attribute 'payload'
---

This error repeats constantly in the logs.
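
To find when the retries started and how often they recur, the log could be scanned for the "Unexpected exception occurred" line. This is a rough sketch, assuming the log line format matches the excerpt above; `summarize_retries` and the sample lines are illustrative only (the second sample line is invented to show the repeat count).

```python
import re

# Matches the ERROR line that opens each traceback in the excerpt above.
RETRY_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+) \d+ ERROR \S+ \[.*?\] "
    r"Unexpected exception occurred (?P<count>\d+) time\(s\)"
)

# First line is copied from this report; the second is a made-up repeat.
SAMPLE_LOG = [
    "2014-06-12 22:52:55.353 27944 ERROR root [-] Unexpected exception occurred 1 time(s)... retrying.",
    "2014-06-12 22:52:55.353 27944 TRACE root Traceback (most recent call last):",
    "2014-06-12 22:52:56.400 27944 ERROR root [-] Unexpected exception occurred 2 time(s)... retrying.",
]


def summarize_retries(lines):
    """Return (first_timestamp, last_timestamp, occurrences) for retry errors."""
    first_ts = last_ts = None
    total = 0
    for line in lines:
        match = RETRY_RE.match(line)
        if match is None:
            continue
        total += 1
        if first_ts is None:
            first_ts = match.group("ts")
        last_ts = match.group("ts")
    return first_ts, last_ts, total


if __name__ == "__main__":
    print(summarize_retries(SAMPLE_LOG))
```

Against a real log, an open file object can be passed in place of `SAMPLE_LOG`, since the function only iterates over lines.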

Versions of installed components:

---
[root@host02 log]# rpm -qa | egrep "openstack|qpid"
openstack-ceilometer-common-2013.2.3-1.el6ost.noarch
openstack-dashboard-theme-2013.2.3-1.el6ost.noarch
openstack-neutron-2013.2.3-7.el6ost.noarch
openstack-nova-novncproxy-2013.2.3-7.el6ost.noarch
python-qpid-0.14-11.el6_3.noarch
redhat-access-plugin-openstack-4.0.0-0.el6ost.noarch
openstack-ceilometer-central-2013.2.3-1.el6ost.noarch
openstack-nova-api-2013.2.3-7.el6ost.noarch
openstack-heat-common-2013.2.3-1.el6ost.noarch
python-qpid-qmf-0.14-14.el6_3.x86_64
openstack-ceilometer-api-2013.2.3-1.el6ost.noarch
openstack-nova-console-2013.2.3-7.el6ost.noarch
python-django-openstack-auth-1.1.2-2.el6ost.noarch
qpid-qmf-0.14-14.el6_3.x86_64
openstack-neutron-openvswitch-2013.2.3-7.el6ost.noarch
openstack-ceilometer-collector-2013.2.3-1.el6ost.noarch
openstack-heat-engine-2013.2.3-1.el6ost.noarch
openstack-nova-conductor-2013.2.3-7.el6ost.noarch
openstack-cinder-2013.2.3-2.el6ost.noarch
openstack-keystone-2013.2.3-4.el6ost.noarch
openstack-heat-api-2013.2.3-1.el6ost.noarch
openstack-glance-2013.2.3-3.el6ost.noarch
qpid-tools-0.14-6.el6_3.noarch
openstack-dashboard-2013.2.3-1.el6ost.noarch
qpid-cpp-server-0.14-22.el6_3.x86_64
openstack-nova-scheduler-2013.2.3-7.el6ost.noarch
openstack-utils-2013.2-3.el6ost.noarch
qpid-cpp-client-0.14-22.el6_3.x86_64
openstack-nova-common-2013.2.3-7.el6ost.noarch
openstack-nova-cert-2013.2.3-7.el6ost.noarch
---

Comment 1 Ian Wienand 2014-06-13 02:03:06 UTC
This looks like it could be bug #1085006.

Comment 3 Ihar Hrachyshka 2014-06-13 08:23:19 UTC
It *is* the same bug.

*** This bug has been marked as a duplicate of bug 1085006 ***

