Description of problem:

Environment:
- Memcached on keystone
- HA setup not based on the RefArch, in which each service has 3 copies.

One of the controllers is powered off, and then we can see these messages in nova:

packages/neutronclient/v2_0/client.py", line 211, in do_request
2016-02-17 07:37:36.960 11318 TRACE nova.compute.manager [instance: 646606a5-cdbe-4b62-a5b7-9b2df83d434e]     self._handle_fault_response(status_code, replybody)
2016-02-17 07:37:36.960 11318 TRACE nova.compute.manager [instance: 646606a5-cdbe-4b62-a5b7-9b2df83d434e]   File "/usr/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 185, in _handle_fault_response
2016-02-17 07:37:36.960 11318 TRACE nova.compute.manager [instance: 646606a5-cdbe-4b62-a5b7-9b2df83d434e]     exception_handler_v20(status_code, des_error_body)
2016-02-17 07:37:36.960 11318 TRACE nova.compute.manager [instance: 646606a5-cdbe-4b62-a5b7-9b2df83d434e]   File "/usr/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 83, in exception_handler_v20
2016-02-17 07:37:36.960 11318 TRACE nova.compute.manager [instance: 646606a5-cdbe-4b62-a5b7-9b2df83d434e]     message=message)
2016-02-17 07:37:36.960 11318 TRACE nova.compute.manager [instance: 646606a5-cdbe-4b62-a5b7-9b2df83d434e] NeutronClientException: Authentication required

And these on keystone:

2016-02-17 08:49:19.743 2630 WARNING keystone.middleware.core [-] RBAC: Invalid token
2016-02-17 08:49:19.744 2630 WARNING keystone.common.wsgi [-] The request you have made requires authentication. (Disable debug mode to suppress these details.)
2016-02-17 08:49:19.745 2630 INFO eventlet.wsgi.server [-] 10.1.255.254 - - [17/Feb/2016 08:49:19] "GET /v2.0/tokens/156a3a2d871b4ab087ad7301b31fd948 HTTP/1.1" 401 424 0.002914
2016-02-17 08:49:19.747 2630 DEBUG keystone.middleware.core [-] Auth token not in the request header. Will not build auth context. process_request /usr/lib/python2.7/site-packages/keystone/middleware/core.py:229

This appears to be https://bugs.launchpad.net/nova/+bug/1491905 ( https://review.openstack.org/#/c/220207/ ).

Version-Release number of selected component (if applicable):
openstack-nova-common-2015.1.1-3.el7ost.noarch
openstack-nova-compute-2015.1.1-3.el7ost.noarch

How reproducible:
On a 3-instances-per-OSP-service HA setup (not our RefArch), if a controller crash is simulated (powered off via iLO), the nova service gets into a blocking state on the 3 computes. To unblock it, a restart of nova-compute.service is required on each node.
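A plausible reading of the "blocking state" is that the compute services keep waiting on a token-cache backend (memcached) that went away with the powered-off controller. The usual client-side mitigation is a short socket timeout plus dead-server fallback. A minimal sketch of that pattern, assuming plain TCP reachability checks (the helper name and server list are ours, not keystone's or nova's):

```python
import socket

# Hypothetical illustration: instead of blocking indefinitely on a memcached
# backend hosted on a dead controller, try each server with a short timeout
# and fall through to the next one.
def first_reachable(servers, timeout=0.2):
    """Return the first (host, port) accepting a TCP connection, else None."""
    for host, port in servers:
        try:
            with socket.create_connection((host, port), timeout=timeout):
                return (host, port)
        except OSError:
            # Dead or unreachable server: skip it rather than hang.
            continue
    return None
```

With a timeout like this, a request against a list containing a dead backend degrades to a sub-second probe instead of a hung worker, which is the behavior the stuck nova-compute processes appear to be missing.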
(In reply to Pablo Iranzo Gómez from comment #0)

> Description of problem:
>
> Environment:
> - Memcache on keystone
> - HA setup not based on RefArch in which each service has 3 copies.
>
> Which seems to be https://bugs.launchpad.net/nova/+bug/1491905 (
> https://review.openstack.org/#/c/220207/ )

I'm not sure those are really related. Can you confirm that Keystone+memcached is running on a different node than the 3 compute services? (According to the sosreport you shared with us, that seems to be the case.)

> How reproducible:
> On a 3 instance-for-each-OSP service HA setup (not our refarch), if a
> simulation of a controller crash is performed (powered off via iLO) nova
> service gets into blocking state on the 3 computes.
> In order to unblock, a nova-compute.service restart is required on each node.

We need some clarifications: are we talking about controller nodes (api, scheduler...) or compute nodes? What exactly is powered off?

> packages/neutronclient/v2_0/client.py", line 211, in do_request
> 2016-02-17 07:37:36.960 11318 TRACE nova.compute.manager [instance: 646606a5-cdbe-4b62-a5b7-9b2df83d434e] self._handle_fault_response(status_code, replybody)
> 2016-02-17 07:37:36.960 11318 TRACE nova.compute.manager [instance: 646606a5-cdbe-4b62-a5b7-9b2df83d434e] File "/usr/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 185, in _handle_fault_response
> 2016-02-17 07:37:36.960 11318 TRACE nova.compute.manager [instance: 646606a5-cdbe-4b62-a5b7-9b2df83d434e] exception_handler_v20(status_code, des_error_body)
> 2016-02-17 07:37:36.960 11318 TRACE nova.compute.manager [instance: 646606a5-cdbe-4b62-a5b7-9b2df83d434e] File "/usr/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 83, in exception_handler_v20
> 2016-02-17 07:37:36.960 11318 TRACE nova.compute.manager [instance: 646606a5-cdbe-4b62-a5b7-9b2df83d434e] message=message)
> 2016-02-17 07:37:36.960 11318 TRACE nova.compute.manager [instance: 646606a5-cdbe-4b62-a5b7-9b2df83d434e] NeutronClientException: Authentication required
>
> And those ones on keystone:
>
> 2016-02-17 08:49:19.743 2630 WARNING keystone.middleware.core [-] RBAC: Invalid token
> 2016-02-17 08:49:19.744 2630 WARNING keystone.common.wsgi [-] The request you have made requires authentication. (Disable debug mode to suppress these details.)
> 2016-02-17 08:49:19.745 2630 INFO eventlet.wsgi.server [-] 10.1.255.254 - - [17/Feb/2016 08:49:19] "GET /v2.0/tokens/156a3a2d871b4ab087ad7301b31fd948 HTTP/1.1" 401 424 0.002914
> 2016-02-17 08:49:19.747 2630 DEBUG keystone.middleware.core [-] Auth token not in the request header. Will not build auth context. process_request /usr/lib/python2.7/site-packages/keystone/middleware/core.py:229

So here we can read that no token is provided in the request to Keystone. Also, the timestamps in the nova and keystone log messages do not coincide.

Can we have some clarification about the environment (controller nodes, compute nodes, keystone nodes)? Where is memcached installed? (I guess it's on the Keystone node.) Is this memcached shared with other components?

Thanks
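For reference, the 401 in the keystone log is a v2 admin-API token-validation GET that arrived without a valid admin token. It can be exercised with a plain HTTP request; a sketch, where the keystone endpoint and admin token are placeholders (only the subject token id is taken from the log above):

```python
import urllib.request

KEYSTONE = "http://192.0.2.10:35357"                 # admin endpoint (placeholder)
ADMIN_TOKEN = "ADMIN_TOKEN_PLACEHOLDER"              # service/admin token (placeholder)
SUBJECT_TOKEN = "156a3a2d871b4ab087ad7301b31fd948"   # token id seen in the 401 log line

# Build the v2 validation request: GET /v2.0/tokens/<subject-token>,
# authenticated via the X-Auth-Token header. With a missing or invalid
# admin token, keystone answers HTTP 401, as in the log above.
req = urllib.request.Request(
    KEYSTONE + "/v2.0/tokens/" + SUBJECT_TOKEN,
    headers={"X-Auth-Token": ADMIN_TOKEN},
)
# urllib.request.urlopen(req) would perform the GET against a live keystone;
# it is left out here since the endpoint above is a placeholder.
```

Running this against the real admin endpoint with and without a valid admin token would help separate "memcached lost the token" from "no token was sent at all".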
Any feedback on my last comment/questions?
A request for more information was sent to the reporter three weeks ago. We are going to close this issue; please feel free to re-open it with further information if the problem persists.