Bug 1640780 - undercloud: neutron_api unhealthy 'openstack server list' returns 'The server has either erred or is incapable of performing the requested operation'
Summary: undercloud: neutron_api unhealthy 'openstack server list' returns 'The server...
Keywords:
Status: CLOSED DUPLICATE of bug 1631335
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-neutron
Version: 14.0 (Rocky)
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: ---
Assignee: Assaf Muller
QA Contact: Roee Agiman
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2018-10-18 18:12 UTC by Alexander Chuzhoy
Modified: 2019-09-09 13:40 UTC
CC List: 13 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-10-19 09:24:05 UTC
Target Upstream Version:
Embargoed:



Description Alexander Chuzhoy 2018-10-18 18:12:30 UTC
The neutron_api container shows as unhealthy on the undercloud.
'openstack server list' returns 'The server has either erred or is incapable of performing the requested operation'

Environment:
openstack-nova-api-18.0.2-0.20181002032919.c2045ed.el7ost.noarch
puppet-nova-13.3.1-0.20181001154308.3f8c3ee.el7ost.noarch
python-nova-18.0.2-0.20181002032919.c2045ed.el7ost.noarch
python-novajoin-1.0.19-0.20180828184454.3d58511.el7ost.noarch
python2-novaclient-11.0.0-0.20180809174649.f1005ce.el7ost.noarch
openstack-nova-common-18.0.2-0.20181002032919.c2045ed.el7ost.noarch
openstack-tripleo-heat-templates-9.0.0-0.20181001174822.90afd18.0rc2.el7ost.noarch
instack-undercloud-9.4.1-0.20180928005746.15cda5a.el7ost.noarch


The undercloud has 24 GB of RAM.
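
Given only 24 GB of RAM, memory pressure is a plausible factor; a quick check of host and per-container memory usage (diagnostic sketch, not output captured from this run):

free -h
sudo docker stats --no-stream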



Steps to reproduce:

1. Deployed the overcloud.
2. After 48 hours, ran 'openstack server list' on the undercloud.

Result:
(undercloud) [stack@undercloud ~]$ openstack server list
The server has either erred or is incapable of performing the requested operation. (HTTP 500) (Request-ID: req-b757cf9a-fae9-4020-ae63-8af9f2398c54)
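
To see what failed on the server side, the request ID from the HTTP 500 can be traced in the nova-api log (assuming the standard containerized log location on the undercloud):

# assumption: nova-api logs to /var/log/containers/nova/nova-api.log on the undercloud
sudo grep req-b757cf9a-fae9-4020-ae63-8af9f2398c54 /var/log/containers/nova/nova-api.log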

Tried restarting the nova containers.

The nova_compute container did not come back up correctly and keeps restarting:

4745332d2683        10.37.168.131:8787/rhosp14/openstack-nova-compute-ironic:2018-10-10.3         "kolla_start"            47 hours ago        Restarting (0) About a minute ago                       nova_compute
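
To see why the container keeps restarting, the last log lines and the recorded exit state can be checked, for example:

sudo docker logs --tail 50 nova_compute
sudo docker inspect --format '{{.State.ExitCode}} {{.State.Error}}' nova_compute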



Looking at nova-compute.log:
<body><h1>503 Service Unavailable</h1>
No server is available to handle this request.
</body></html> (HTTP 500)
2018-10-18 13:57:18.799 1 ERROR oslo_service.service Traceback (most recent call last):
2018-10-18 13:57:18.799 1 ERROR oslo_service.service   File "/usr/lib/python2.7/site-packages/oslo_service/service.py", line 796, in run_service
2018-10-18 13:57:18.799 1 ERROR oslo_service.service     service.start()
2018-10-18 13:57:18.799 1 ERROR oslo_service.service   File "/usr/lib/python2.7/site-packages/nova/service.py", line 162, in start
2018-10-18 13:57:18.799 1 ERROR oslo_service.service     self.manager.init_host()
2018-10-18 13:57:18.799 1 ERROR oslo_service.service   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 1211, in init_host
2018-10-18 13:57:18.799 1 ERROR oslo_service.service     self._init_instance(context, instance)
2018-10-18 13:57:18.799 1 ERROR oslo_service.service   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 958, in _init_instance
2018-10-18 13:57:18.799 1 ERROR oslo_service.service     self.driver.plug_vifs(instance, net_info)
2018-10-18 13:57:18.799 1 ERROR oslo_service.service   File "/usr/lib/python2.7/site-packages/nova/virt/ironic/driver.py", line 1522, in plug_vifs
2018-10-18 13:57:18.799 1 ERROR oslo_service.service     self._plug_vifs(node, instance, network_info)
2018-10-18 13:57:18.799 1 ERROR oslo_service.service   File "/usr/lib/python2.7/site-packages/nova/virt/ironic/driver.py", line 1492, in _plug_vifs
2018-10-18 13:57:18.799 1 ERROR oslo_service.service     self._plug_vif(node, port_id)
2018-10-18 13:57:18.799 1 ERROR oslo_service.service   File "/usr/lib/python2.7/site-packages/nova/virt/ironic/driver.py", line 1451, in _plug_vif
2018-10-18 13:57:18.799 1 ERROR oslo_service.service     port_id, retry_on_conflict=False)
2018-10-18 13:57:18.799 1 ERROR oslo_service.service   File "/usr/lib/python2.7/site-packages/nova/virt/ironic/client_wrapper.py", line 170, in call
2018-10-18 13:57:18.799 1 ERROR oslo_service.service     return self._multi_getattr(client, method)(*args, **kwargs)
2018-10-18 13:57:18.799 1 ERROR oslo_service.service   File "/usr/lib/python2.7/site-packages/ironicclient/v1/node.py", line 414, in vif_attach
2018-10-18 13:57:18.799 1 ERROR oslo_service.service     self.update(path, data, http_method="POST")
2018-10-18 13:57:18.799 1 ERROR oslo_service.service   File "/usr/lib/python2.7/site-packages/ironicclient/v1/node.py", line 359, in update
2018-10-18 13:57:18.799 1 ERROR oslo_service.service     params=params)
2018-10-18 13:57:18.799 1 ERROR oslo_service.service   File "/usr/lib/python2.7/site-packages/ironicclient/common/base.py", line 232, in _update
2018-10-18 13:57:18.799 1 ERROR oslo_service.service     resp, body = self.api.json_request(method, url, **kwargs)
2018-10-18 13:57:18.799 1 ERROR oslo_service.service   File "/usr/lib/python2.7/site-packages/ironicclient/common/http.py", line 678, in json_request
2018-10-18 13:57:18.799 1 ERROR oslo_service.service     resp = self._http_request(url, method, **kwargs)
2018-10-18 13:57:18.799 1 ERROR oslo_service.service   File "/usr/lib/python2.7/site-packages/ironicclient/common/http.py", line 287, in wrapper
2018-10-18 13:57:18.799 1 ERROR oslo_service.service     return func(self, url, method, **kwargs)
2018-10-18 13:57:18.799 1 ERROR oslo_service.service   File "/usr/lib/python2.7/site-packages/ironicclient/common/http.py", line 660, in _http_request
2018-10-18 13:57:18.799 1 ERROR oslo_service.service     error_json.get('debuginfo'), method, url)
2018-10-18 13:57:18.799 1 ERROR oslo_service.service InternalServerError: Could not retrieve network list: <html><body><h1>503 Service Unavailable</h1>
2018-10-18 13:57:18.799 1 ERROR oslo_service.service No server is available to handle this request.
2018-10-18 13:57:18.799 1 ERROR oslo_service.service </body></html> (HTTP 500)
2018-10-18 13:57:18.799 1 ERROR oslo_service.service 
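
The 503 indicates nothing is answering behind the neutron API endpoint, so a direct check of the endpoint is useful (9696 is assumed here as the default neutron-server port; substitute the URL from the catalog):

openstack catalog show network
# assumption: neutron-server listens on the default port 9696; replace <undercloud-ip> with the endpoint host from the catalog
curl -sS -o /dev/null -w '%{http_code}\n' http://<undercloud-ip>:9696/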








 [root@undercloud ~]# sudo docker ps|grep neutron
3baa5bffd6e2        10.37.168.131:8787/rhosp14/openstack-neutron-dhcp-agent:2018-10-10.3          "ip netns exec qdh..."   47 hours ago        Up 47 hours                                        neutron-dnsmasq-qdhcp-505bd9c4-bc42-475e-a96a-e587bf9ff5d0
623d64ddfb61        10.37.168.131:8787/rhosp14/openstack-ironic-neutron-agent:2018-10-10.3        "kolla_start"            47 hours ago        Up 47 hours                                        ironic_neutron_agent
a9f1c5f794de        10.37.168.131:8787/rhosp14/openstack-neutron-openvswitch-agent:2018-10-10.3   "kolla_start"            47 hours ago        Up 47 hours (healthy)                              neutron_ovs_agent
2fc0b8b260fd        10.37.168.131:8787/rhosp14/openstack-neutron-l3-agent:2018-10-10.3            "kolla_start"            47 hours ago        Up 47 hours (healthy)                              neutron_l3_agent
147849fd3841        10.37.168.131:8787/rhosp14/openstack-neutron-dhcp-agent:2018-10-10.3          "kolla_start"            47 hours ago        Up 47 hours (healthy)                              neutron_dhcp
4eb0e9b7b2a0        10.37.168.131:8787/rhosp14/openstack-neutron-server:2018-10-10.3              "kolla_start"            47 hours ago        Up 47 hours (unhealthy)                            neutron_api
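
The healthcheck results explain why the container is flagged unhealthy; for example:

# dump the recorded healthcheck status and last probe output for neutron_api
sudo docker inspect --format '{{json .State.Health}}' neutron_api | python -m json.tool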
 [root@undercloud ~]# sudo docker restart neutron_api

neutron_api
 [root@undercloud ~]# 

 [root@undercloud ~]# sudo docker ps|grep neutron
173be23da58d        10.37.168.131:8787/rhosp14/openstack-neutron-dhcp-agent:2018-10-10.3          "ip netns exec qdh..."   3 seconds ago       Up 2 seconds                                           neutron-dnsmasq-qdhcp-505bd9c4-bc42-475e-a96a-e587bf9ff5d0
623d64ddfb61        10.37.168.131:8787/rhosp14/openstack-ironic-neutron-agent:2018-10-10.3        "kolla_start"            47 hours ago        Up 47 hours                                            ironic_neutron_agent
a9f1c5f794de        10.37.168.131:8787/rhosp14/openstack-neutron-openvswitch-agent:2018-10-10.3   "kolla_start"            47 hours ago        Up 47 hours (healthy)                                  neutron_ovs_agent
2fc0b8b260fd        10.37.168.131:8787/rhosp14/openstack-neutron-l3-agent:2018-10-10.3            "kolla_start"            47 hours ago        Up 47 hours (healthy)                                  neutron_l3_agent
147849fd3841        10.37.168.131:8787/rhosp14/openstack-neutron-dhcp-agent:2018-10-10.3          "kolla_start"            47 hours ago        Up 47 hours (healthy)                                  neutron_dhcp
4eb0e9b7b2a0        10.37.168.131:8787/rhosp14/openstack-neutron-server:2018-10-10.3              "kolla_start"            47 hours ago        Up 20 seconds (health: starting)                       neutron_api



(undercloud) [stack@undercloud ~]$ sudo docker ps |grep neutron
173be23da58d        10.37.168.131:8787/rhosp14/openstack-neutron-dhcp-agent:2018-10-10.3          "ip netns exec qdh..."   7 minutes ago       Up 7 minutes                                    neutron-dnsmasq-qdhcp-505bd9c4-bc42-475e-a96a-e587bf9ff5d0
623d64ddfb61        10.37.168.131:8787/rhosp14/openstack-ironic-neutron-agent:2018-10-10.3        "kolla_start"            47 hours ago        Up 47 hours                                     ironic_neutron_agent
a9f1c5f794de        10.37.168.131:8787/rhosp14/openstack-neutron-openvswitch-agent:2018-10-10.3   "kolla_start"            47 hours ago        Up 47 hours (healthy)                           neutron_ovs_agent
2fc0b8b260fd        10.37.168.131:8787/rhosp14/openstack-neutron-l3-agent:2018-10-10.3            "kolla_start"            47 hours ago        Up 47 hours (healthy)                           neutron_l3_agent
147849fd3841        10.37.168.131:8787/rhosp14/openstack-neutron-dhcp-agent:2018-10-10.3          "kolla_start"            47 hours ago        Up 47 hours (healthy)                           neutron_dhcp
4eb0e9b7b2a0        10.37.168.131:8787/rhosp14/openstack-neutron-server:2018-10-10.3              "kolla_start"            47 hours ago        Up 7 minutes (healthy)                          neutron_api
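
With neutron_api back to healthy, re-running the original command should show whether the API is serving again:

openstack server list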


Looking for neutron errors in /var/log/containers/neutron/l3-agent.log:

2018-10-18 14:00:57.672 42342 ERROR neutron.agent.l3.agent [-] Failed reporting state!: MessagingTimeout: Timed out waiting for a reply to message ID 8cba2ea7f3a24d9589567a787fb66254
2018-10-18 14:00:57.672 42342 ERROR neutron.agent.l3.agent Traceback (most recent call last):
2018-10-18 14:00:57.672 42342 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 755, in _report_state
2018-10-18 14:00:57.672 42342 ERROR neutron.agent.l3.agent     True)
2018-10-18 14:00:57.672 42342 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/agent/rpc.py", line 97, in report_state
2018-10-18 14:00:57.672 42342 ERROR neutron.agent.l3.agent     return method(context, 'report_state', **kwargs)
2018-10-18 14:00:57.672 42342 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/client.py", line 179, in call
2018-10-18 14:00:57.672 42342 ERROR neutron.agent.l3.agent     retry=self.retry)
2018-10-18 14:00:57.672 42342 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/oslo_messaging/transport.py", line 133, in _send
2018-10-18 14:00:57.672 42342 ERROR neutron.agent.l3.agent     retry=retry)
2018-10-18 14:00:57.672 42342 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 584, in send
2018-10-18 14:00:57.672 42342 ERROR neutron.agent.l3.agent     call_monitor_timeout, retry=retry)
2018-10-18 14:00:57.672 42342 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 573, in _send
2018-10-18 14:00:57.672 42342 ERROR neutron.agent.l3.agent     call_monitor_timeout)
2018-10-18 14:00:57.672 42342 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 459, in wait
2018-10-18 14:00:57.672 42342 ERROR neutron.agent.l3.agent     message = self.waiters.get(msg_id, timeout=timeout)
2018-10-18 14:00:57.672 42342 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 336, in get
2018-10-18 14:00:57.672 42342 ERROR neutron.agent.l3.agent     'to message ID %s' % msg_id)
2018-10-18 14:00:57.672 42342 ERROR neutron.agent.l3.agent MessagingTimeout: Timed out waiting for a reply to message ID 8cba2ea7f3a24d9589567a787fb66254
2018-10-18 14:00:57.672 42342 ERROR neutron.agent.l3.agent 
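
The MessagingTimeout points at RPC rather than the agent itself, so the RabbitMQ broker on the undercloud is worth checking (container name "rabbitmq" is assumed here):

sudo docker exec rabbitmq rabbitmqctl status
sudo docker exec rabbitmq rabbitmqctl list_queues name messages consumers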


/var/log/containers/neutron/openvswitch-agent.log

2018-10-18 14:00:57.680 42736 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent [-] Failed reporting state!: MessagingTimeout: Timed out waiting for a reply to message ID f0b874e241284900ba430879f66b4656
2018-10-18 14:00:57.680 42736 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent Traceback (most recent call last):
2018-10-18 14:00:57.680 42736 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent   File "/usr/lib/python2.7/site-packages/neutron/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py", line 325, in _report_state
2018-10-18 14:00:57.680 42736 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent     True)
2018-10-18 14:00:57.680 42736 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent   File "/usr/lib/python2.7/site-packages/neutron/agent/rpc.py", line 97, in report_state
2018-10-18 14:00:57.680 42736 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent     return method(context, 'report_state', **kwargs)
2018-10-18 14:00:57.680 42736 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent   File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/client.py", line 179, in call
2018-10-18 14:00:57.680 42736 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent     retry=self.retry)
2018-10-18 14:00:57.680 42736 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent   File "/usr/lib/python2.7/site-packages/oslo_messaging/transport.py", line 133, in _send
2018-10-18 14:00:57.680 42736 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent     retry=retry)
2018-10-18 14:00:57.680 42736 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent   File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 584, in send
2018-10-18 14:00:57.680 42736 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent     call_monitor_timeout, retry=retry)
2018-10-18 14:00:57.680 42736 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent   File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 573, in _send
2018-10-18 14:00:57.680 42736 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent     call_monitor_timeout)
2018-10-18 14:00:57.680 42736 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent   File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 459, in wait
2018-10-18 14:00:57.680 42736 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent     message = self.waiters.get(msg_id, timeout=timeout)
2018-10-18 14:00:57.680 42736 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent   File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 336, in get
2018-10-18 14:00:57.680 42736 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent     'to message ID %s' % msg_id)
2018-10-18 14:00:57.680 42736 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent MessagingTimeout: Timed out waiting for a reply to message ID f0b874e241284900ba430879f66b4656
2018-10-18 14:00:57.680 42736 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent
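
Since both agents hit the same MessagingTimeout while reporting state, checking whether they are still registered and alive from neutron's point of view is also useful:

openstack network agent list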

Comment 2 Bernard Cafarelli 2018-10-19 09:24:05 UTC
This is bug #1631335; restarting the affected containers is the current workaround for older test versions.

*** This bug has been marked as a duplicate of bug 1631335 ***

